EXECUTIVE SUMMARY


This study was undertaken to improve the utility of MIL-HDBK-217 for
reliability prediction of spacecraft components and systems. As part of this
effort, over 3,000 reports of anomalous incidents affecting U.S. spacecraft
(plus a small number of foreign spacecraft) were analyzed. Slightly over
2,500 of these reports were sufficiently detailed to permit assignment of the
failure to a mission time and a specific subsystem, and in approximately 80%
of these, further analysis was possible to determine the underlying cause of
the failure (design, quality, etc.) and the specific part in which the
failure originated. The data were obtained from over 300 satellites
comprising 96 programs that were launched between the early 1960s and
January 1984.

A primary motivation for this effort was a set of earlier reports indicating
that the hazard (failure rate normalized with respect to the surviving
population) decreased with time on orbit. Reliability prediction based on
MIL-HDBK-217 assumes an exponential failure law, which corresponds to
constant hazard. If there is strong evidence that hazard is indeed
decreasing, this should be taken into account in the reliability model in
order to permit realistic predictions and improved allocation of reliability
resources.


As shown in Figure 0-1, this study has produced very strong evidence for the
existence of a decreasing hazard. The cause of this apparent deviation from
conventional reliability experience has been traced to failures due to design
and environmental causes. These occur with decreasing frequency with time on
orbit, corresponding to the decreasing probability of encountering an
environment that is more stressful than a previously encountered one. The
classical parts, quality, and operational failures do not deviate
significantly from the exponential failure distribution after an initial
period dominated by infant mortality. From the distribution of causes of
failure, shown in Figure 0-2, it is seen that design and environment together
account for about 45% of the failures, and that parts, quality, and unknown
causes together account for about an equal percentage. (Chapter 2)

The study found a significant difference in failure rates among subsystems,
as shown in Figure 0-3, which can be explained in terms of relative
complexity. The number of anomalous incidents per spacecraft is higher in
post-1977 space programs than in earlier ones, but the severity of failures
is significantly less. The increased complexity of recent satellite designs
(many of them multi-mission) accounts for the greater number of failures, and
the higher redundancy and ruggedness of the subsystems accounts for the
lesser severity of incidents. (Chapter 3) Failure rates were affected by the
mission type, with communication satellites generally having the lowest
failure rates and navigation satellites having the highest ones. This seems
to reflect the relative maturity of the technologies employed in the
satellite design. Orbit altitude did not by itself have a major effect on
the failure rate, but orbit-dependent equipment selection (e.g., the need
for tape recorders on low altitude missions) produced an apparent
altitude-related effect. (Chapter 4)

FIGURE 0-3. DISTRIBUTION OF FAILURES BY SUBSYSTEM

Based on these observations, a reliability prediction procedure has been
developed in which satellite reliability is composed of two factors that
account for parts and mission effects, respectively. The general model is

    $R = R_{\text{parts}} \cdot R_{\text{mission}}$

where the first factor comprises an exponential reliability prediction based
on MIL-HDBK-217 procedures, while in the second factor a Weibull model is used
to account for the decreasing hazard associated with design and environment
failures. This model is validated by a comparison of predicted and
demonstrated reliability from two spacecraft programs. The new model will in
general predict a higher reliability for long mission durations. Use of this
model in trade-offs and design decisions will lead to a more realistic
assessment of space mission reliability and permit a better allocation of
resources in satellite design. Alternative models for situations where the
parameters of the two-part model cannot be obtained are also provided.
(Chapter 5)
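
The structure of the two-factor model can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the Weibull
factor is written in the form R(t) = exp(-t^b/a) given in Chapter 2, and the
failure rate and Weibull parameters below are hypothetical values, not
results of the study.

```python
import math

def parts_reliability(t, failure_rate):
    """Exponential (constant hazard) factor, R_parts = exp(-lambda * t)."""
    return math.exp(-failure_rate * t)

def mission_reliability(t, a, b):
    """Weibull factor in the form R_mission = exp(-t**b / a)."""
    return math.exp(-(t ** b) / a)

def satellite_reliability(t, failure_rate, a, b):
    """Product of the parts factor and the mission factor."""
    return parts_reliability(t, failure_rate) * mission_reliability(t, a, b)

# Hypothetical parameter values, chosen only to exercise the functions.
FAILURE_RATE = 2.0e-6   # failures per hour (assumption)
A, B = 40.0, 0.3        # Weibull scale and shape (assumption)

if __name__ == "__main__":
    for years in (1, 3, 5, 7):
        t = years * 8760.0  # mission time in hours
        r = satellite_reliability(t, FAILURE_RATE, A, B)
        print(f"{years} yr: R = {r:.3f}")
```

Keeping the two factors as separate functions mirrors the text: the first
can be replaced by any MIL-HDBK-217 style prediction without touching the
second.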

Two alternate prediction methods are provided. One method is
applicable for the subsystem and component designer who needs failure rate
information for part selection and reliability design decisions. Although
not as accurate as the primary method, the procedure is simple to apply and
involves a modification of the space environment factor (S_F) in
MIL-HDBK-217 by a factor of 0.5. Time dependency effects for the failure
rate are not directly considered by use of the S_F modifier. However,
current MIL-HDBK-217 methods tend to overestimate space environment failure
rates, and use of the S_F modifier results in an overall improvement in
predictions. A piecewise exponential model is also provided to account for
time dependency effects when the S_F factor is not modified.

The second method is applicable for the mission planner and the spacecraft
designer in those cases where the prediction must be based upon
similar spacecraft missions and extrapolations to longer mission durations
are necessary. A single term Weibull model is used in which the beta
parameter has been empirically determined to give a workable fit to the
observed spacecraft reliability data.
Chapter 1


INTRODUCTION 


1.1  Objectives of This Report


Reliability prediction for spacecraft is practiced on three levels:

-  mission planning and spacecraft specification

-  spacecraft design

-  spacecraft subsystem and component design

The findings of this report are of interest at all three levels.

The mission planner is interested in determining the satellite lifetime
which results in the lowest cost per year, and the prediction of the failure
rates is obviously an important input to that analysis. The time dependence
of failure rates investigated in Chapter 2 and historical satellite
reliability trends discussed in Chapter 3 respond to that need. Also, gross
mission failure rates and the effect of subsystem and orbit parameters on
component reliability, which are discussed in Chapter 4, will be of importance
at that level. The single term Weibull reliability prediction model
described in Section 5.3.2 is particularly suited for mission planning.

The spacecraft designer is faced with the need for determining fault
tolerance and redundancy requirements for subsystems and major components.
Predicted subsystem failure rates discussed in Chapter 4 are the major data
input to these decisions. The spacecraft designer must also provide
environmental protection for the equipment, select duty cycles for some of
the spacecraft functions, and plan for testing of the satellite as a
whole as well as for its components. The analysis of causes of spacecraft
failures presented in Chapter 3 will be helpful in these decisions. The
reliability prediction procedures of Chapter 5 address a direct need of the
spacecraft designer.

The subsystem and component designer needs failure rate information for parts
selection and internal redundancy decisions. The reliability prediction
procedures found in Chapter 5 are applicable to this environment, and the
piecewise exponential model described in Section 5.3.1 may be particularly
suitable. Causes of component failures discussed in Chapter 3 and detailed
analyses presented in Chapter 4 are also pertinent to the design decisions
made at this level.


Selected data on satellite failures were transcribed from existing data bases
(see following section) into a dedicated data base for this study, which
contained for each incident

-  Satellite Program

-  Flight Number

-  Month and Year of Launch

-  Failure Time (in months on orbit)

-  Severity Classification

-  Cause of Failure (up to three classifications)

-  Subsystem Affected

-  Part Affected (where applicable)


Subsets of the data base could be extracted for any combination of logical
and quantitative conditions. The estimates of quantitative parameters
presented in the body of the report were in most cases derived by
multivariate regression. Tests of hypotheses were used to support
qualitative findings, such as distinctions between contributions to the
failure rate by various causes. Statistical aspects of the methodology are
discussed in Appendix A.


The two major data sources utilized in this study were the Orbital Data
Analysis Program (ODAP) at The Aerospace Corporation and the On-Orbit
Spacecraft Reliability (OOSR) data compiled by Planning Research Corporation
for NASA.

The study started with an ODAP compilation as of December 1982, received a
major update in June of 1983, and was finally brought up to date as of July
31, 1984, at which time most failures that had occurred during 1983 and a few
later ones had been captured. Dr. Max Weiss, Dr. F. D. Maxwell, and Mr. Jay
Leary were particularly helpful in furnishing this material and associated
documents, and in critiquing preliminary findings that were discussed with
them.

The OOSR study was completed in January 1983, and no updates were obtained
during the conduct of the effort reported on here. Mr. Bloomquist and Ms.
Graham were generous with their time in explaining their methodology and in
permitting us access to original files to explore details that were not
available in the published documents.

Further  details  on  the  data  bases  are  presented  in  Appendix  B. 
Mr. Myron Lipow and Mr. Sam Lehr of TRW were very helpful in discussing their
methodology for spacecraft reliability prediction and in furnishing data
utilized in that process.

The RADC Project Engineer for the study, Mr. Eugene Florentine, provided much
constructive guidance throughout the investigation. His review of the draft
of this report helped us to provide needed clarifications and to avoid
inconsistencies. The formulation of the simplified exponential approximation
for reliability prediction in Section 5.3.1 is due to his suggestion.

We  want  to  express  our  gratitude  for  this  assistance  while  at  the  same  time 
asserting  that  the  conclusions  presented  here  are  exclusively  the 
responsibility  of  the  authors. 


Chapter  2 


TIME DEPENDENCE OF THE FAILURE RATE


Standard methods of reliability prediction, including those
described in MIL-STD-756 and MIL-HDBK-217, are based on an
exponential failure rate assumption. This implies that the
probability of failure over some fixed finite time interval among
the survivors at the beginning of that interval is constant and
independent of prior service. Because of this characteristic the
exponential failure distribution is sometimes called "the
distribution without memory". The exponential failure rate
assumption has been found consistent with experience in many
terrestrial electronic applications, and it leads to
mathematically tractable reliability models. It has therefore
also been adopted for spacecraft reliability prediction.
However, for a number of years there has been evidence that space
applications experience a decreasing hazard, and the data
collected in the present effort confirm this finding.
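
The "without memory" property can be checked numerically. The following is a
minimal sketch, not part of the original study: the failure rate and Weibull
parameters are hypothetical values chosen only for illustration, and the
Weibull reliability is written in the form R(t) = exp(-t^b/a) used later in
this chapter. For the exponential law the conditional failure probability
over a fixed interval is the same at any age; for a Weibull law with shape
parameter below one it falls with age.

```python
import math

def conditional_failure_probability(reliability, t, dt):
    """P(failure in (t, t + dt] | survival to t) = 1 - R(t + dt) / R(t)."""
    return 1.0 - reliability(t + dt) / reliability(t)

# Hypothetical parameters, for illustration only.
LAMBDA = 1.0e-4      # constant failure rate per hour (assumption)
A, B = 50.0, 0.5     # Weibull scale and shape; B < 1 gives decreasing hazard

def exponential_reliability(t):
    return math.exp(-LAMBDA * t)

def weibull_reliability(t):
    return math.exp(-(t ** B) / A)

if __name__ == "__main__":
    interval = 1000.0  # hours
    for t in (1000.0, 10000.0, 30000.0):
        p_exp = conditional_failure_probability(exponential_reliability, t, interval)
        p_wei = conditional_failure_probability(weibull_reliability, t, interval)
        print(f"t = {t:7.0f} h  exponential: {p_exp:.5f}  Weibull: {p_wei:.5f}")
```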

This chapter first summarizes prior investigations into the
decreasing hazard phenomenon, then presents the results of the
current investigation and analyzes the possible processes that
can cause a decreasing hazard, and finally discusses the
implications of the decreasing hazard for spacecraft reliability
prediction.


2.1  Historical  Perspective 


Early evidence of decreasing hazard can be found in a study of satellite
failures during the decade ending 1970 sponsored by the Navy Space Systems
Office [BEAN71]. A particularly significant illustration from that report is
reproduced in Figure 2-1. It is seen that the number of anomalous incidents
decreases much faster than the number of (operational) spacecraft in the
sample. A quantitative analysis of these data for successive 10,000 hour
periods is presented in Table 2-1.

TABLE 2-1  DECREASING HAZARD IN EARLY SPACECRAFT

Period ending   Failures         Avg. Operat.   Hazard
(hours)         per 1000 hrs.    Spacecraft     (see note)

10,000          74               96             0.77
20,000          12               48             0.25
30,000           3               22             0.14

Note:  Hazard is expressed as number of failures per 1000 operating
spacecraft-hours
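
The hazard column can be reproduced from the other two data columns. The
short sketch below assumes that the second column gives failures per 1000
hours of calendar time, which is the reading under which the tabulated
hazard values are recovered; the function name is introduced here for
illustration only.

```python
# Rows of Table 2-1: (period ending in hours, failures per 1000 hrs,
# average number of operating spacecraft).
TABLE_2_1 = [
    (10_000, 74, 96),
    (20_000, 12, 48),
    (30_000, 3, 22),
]

def hazard_per_1000_spacecraft_hours(failures_per_1000_hrs, avg_operating_spacecraft):
    """Failures per 1000 operating spacecraft-hours."""
    return failures_per_1000_hrs / avg_operating_spacecraft

if __name__ == "__main__":
    for period_end, failures, spacecraft in TABLE_2_1:
        hazard = hazard_per_1000_spacecraft_hours(failures, spacecraft)
        print(f"period ending {period_end:>6} h: hazard = {hazard:.2f}")
```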

Investigation of this phenomenon was not a specific objective of the
referenced report, and it is not further commented on there (Table 2-1 was
compiled as part of the current investigation). However, a few years later
researchers at NASA Goddard addressed the constant hazard assumption and
found that "it does not occur until 90 (or probably more) days in space"
[TIMM75]. That study also introduced normalized failure rates (dividing the
observed failures during a given period by the number of spacecraft
contributing to the observations). This technique is continued in the
present investigation, and the term failure ratio is used for the failure rate
that is normalized in this manner. The failure ratio is used as an
approximation for the hazard (for definitions of hazard see [LLOY77, p. 135]
or [VANA64, p. 61]; 'hazard function' or the shorter 'hazard' used in the
former reference seems preferable to 'hazard rate' used in the latter).

The normalized malfunction rate computed in the NASA Goddard study is
illustrated in Figure 2-2. The definitions used in connection with this
figure are

Failure    the loss of operation of any function, part, component,
           or subsystem, whether or not redundancy permitted
           recovery of operation

Problem    any substandard performance or partial loss of function
           which is not sufficient to be classed as a failure
FIGURE 2-2. TIME DISTRIBUTION OF NORMALIZED SPACE MALFUNCTIONS
FROM 57 GSFC SPACECRAFT


In a later publication [NORR76] the same research group fitted Duane and
Weibull models to their results and found a decreasing hazard function over
the entire time span covered by the data (roughly three years). They also
found a very good fit to the Duane model at the component failure level, as
shown in Figure 2-3 (examples of components are tape recorders and
transmitters). Excess failures observed during the very early life,
specifically during the first 30 days on orbit, were found to be related to
inadequacies of spacecraft and component testing. No other explanation for
the decreasing hazard is offered in these reports.

The NASA Goddard studies, as well as all others discussed in this chapter,
counted as a malfunction any observation of nonconforming behavior, whether
it occurred in a spare or in an active unit. Therefore, the entire
spacecraft equipment can be modeled as being in a series configuration for
evaluating the failure rate. If the exponential failure law applies at the
component or lower level, the total failures observed should therefore also
follow the exponential distribution.

An update of the Navy Space Systems study prepared in 1978 showed further
evidence of decreasing hazard [BLOO78]. That report includes many spacecraft
with lifetimes in excess of three years, and further decreases in hazard are
implied for these. Excerpts from Exhibit 3 of the reference are shown in
Table 2-2. Each row summarizes the data for the first 10 spacecraft that
exceed the lifetime shown in the first column; in most cases the longest
lifetime included is within 2,000 hours of the threshold. The hazard is an
average value because the reference does not provide incremental data.

TABLE 2-2  HAZARD EXPERIENCE IN 1978 REPORT

Spacecraft Life   Hazard
(Hours)           (Failures per 1000 Spacecraft-Hours)

 4,000            1.20
 8,000            0.60
16,000            0.48
32,000            0.27
At the time that the report became available, a number of Air Force satellite
programs that could benefit from long mission times (e.g., communication
and navigation programs) were in the early implementation phase. It was
realized that these satellites could be designed in a more economical manner
if advantage were taken of the lower hazard at prolonged on-orbit periods, but
because no clear cause for the decreasing hazard phenomenon could be
identified, it was decided to stay with the exponential failure rate
assumption as a "conservative" approximation of the true reliability
function. However, a technical need for improved knowledge of spacecraft
electronic failure rates was recognized, and this need is addressed in the
present study. The findings and analysis of this part of the study will be
found in the immediately following sections. The implications of the
decreasing hazard for various aspects of spacecraft reliability prediction
are discussed in the final section of this chapter.


2.2  Decreasing Hazard Findings

As shown in Figure 2-4, the failure ratio (defined as an approximation of
hazard in the previous section) decreases throughout the satellite life, with
the greatest decrease during the first three years. During the second year
the failure ratio is approximately one-half of the average for the first year
(and slightly over one-third of the average for the first six months). At
the end of the third year it has decreased to about one-third of its average
value during the first year, and at the end of eight years it is down to
about one-tenth of the failure ratio during the first six months. This has
very significant implications for mission planning and redundancy
provisions, as shown in the last section of this chapter. However, before
this finding can be accepted at face value a number of possible objections
must be resolved. Two factors may cause the observed failure ratio to
decrease while the true failure ratio remains constant:

-  shadowing and

-  decreased user interest or funding

Shadowing designates the loss of observability for parts that are associated
with a failed component. As an example of this process, consider a failure in
a tape recorder or multiplexer, components for which most satellites carry
spares. As soon as a disabling failure in the primary or active unit occurs,
it is switched out and the spare unit is activated. Because no further use
is made of the original unit, subsequent parts failures will not be
detected. Even more significant can be the termination of an entire mission
package, such as the cessation of all optical weather observations when a
vidicon fails. There is no doubt that the reports used in Figure 2-4 are
affected by shadowing, but it is not believed to account for a significant
part of the decrease in hazard because

-  The data presented in Figure 2-3, which are on a component basis and
   therefore not subject to shadowing, also show a decreasing hazard. The
   source for these data computed a Weibull shape parameter (b in the
   notation used in the present report) of 0.311, indicative of a
   decreasing hazard.

-  Failures of a severity that disables components but not an entire
   subsystem (severity classifications 2 and 3) occur at the rate of
   approximately 0.5 per spacecraft-year between the second and eighth year
   on orbit. The average component population under observation is at least
   65 (this figure is given in [NORR76] for the comparatively simple
   satellites launched prior to 1970). Thus, the decrease in hazard
   accounted for by shadowing is less than 1% per year, whereas the decrease
   in hazard shown in Figure 2-4 is over 10% per year between the second and
   eighth year. Failures that disable a major subsystem but not the entire
   satellite occur at a rate of less than 0.1 per year. Assuming that such a
   failure will remove five components from observation, the effect is
   comparable to or less than that due to component failures.
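
The bound on the shadowing effect follows from simple arithmetic on the
figures quoted above. The sketch below is illustrative only; the function
name is introduced here, and it assumes that each component-level failure
removes one component from observation.

```python
COMPONENT_POPULATION = 65          # lower bound on components per spacecraft
COMPONENT_FAILURES_PER_YEAR = 0.5  # severity 2 and 3, per spacecraft-year
SUBSYSTEM_FAILURES_PER_YEAR = 0.1  # upper bound quoted in the text
COMPONENTS_PER_SUBSYSTEM_FAILURE = 5  # assumption stated in the text

def shadowed_fraction_per_year(events_per_year, components_lost_per_event, population):
    """Fraction of the observed component population lost to shadowing per year."""
    return events_per_year * components_lost_per_event / population

component_effect = shadowed_fraction_per_year(
    COMPONENT_FAILURES_PER_YEAR, 1, COMPONENT_POPULATION)
subsystem_effect = shadowed_fraction_per_year(
    SUBSYSTEM_FAILURES_PER_YEAR, COMPONENTS_PER_SUBSYSTEM_FAILURE, COMPONENT_POPULATION)

if __name__ == "__main__":
    print(f"component failures: {100 * component_effect:.2f}% per year")
    print(f"subsystem failures: {100 * subsystem_effect:.2f}% per year")
```

Both contributions come out below 1% per year, an order of magnitude
smaller than the decrease of over 10% per year read from Figure 2-4.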
The lack of interest or funding as a satellite operates past the initially
planned period may cause failures to go undetected or unreported, thereby
creating the illusion of a decreasing hazard. The lower rate of reporting is
especially likely to affect minor discrepancies, transient failures, and
conditions which could be easily corrected by operational procedures. The
ratio of minor malfunctions reported to the total failures is therefore
expected to decrease if there is systematic underreporting of the former for
longer mission durations. There is some evidence of this effect in the data,
as discussed below.

The database used in this study classified criticality as follows:

1.  Mission critical

2.  Single point failure (affecting a major subsystem)

3.  Redundant unit

4.  Work around

5.  Degraded performance

6.  Temporary

7.  All others

In the following, classifications 4-7 are grouped together as low
criticality failures. The observed ratio of low criticality failures to all
failures, shown in Figure 2-5, is almost constant for the first five years on
orbit and exhibits a slightly decreasing trend thereafter at the rate of
about 3% per year. Since failures of low criticality initially comprise
somewhat less than two-thirds of the total, this effect translates to
underreporting at a 2% per year rate for the total population. This can
account for some but by no means all of the decreasing hazard observed after
the first five years on orbit.
FIGURE 2-5. DECREASING REPORTING TREND


A final reservation about acceptance of decreasing hazard arises from the
incompatibility of such a characteristic with the established and observed
failure patterns of electronic parts. It will shortly be seen, however, that
conventional parts failures account for only a fraction of the total failures
that affect spacecraft in orbit, and that other causes of failure are
compatible with a decreasing hazard.

That many spacecraft systems employ redundancy does not affect the
conclusions presented here, since failures in all equipments (active and
standby) were monitored and reported. As far as failure reporting is
concerned, a simple series model of all equipments can therefore be assumed.

Weibull hazard plots were fitted to the observed failure ratios by a least
squares method that has been in use for many years [KAO56]. The form of the
Weibull hazard function used here is

    $z(t) = \frac{b}{a}\, t^{\,b-1}$

where b (beta is used in most texts) is the shape parameter and a (or alpha)
is the scale parameter. The corresponding reliability function is

    $R(t) = \exp\left(-t^{b}/a\right)$
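
The hazard and reliability functions are linked by z(t) = -d/dt ln R(t).
The following minimal sketch checks that relation numerically for the form
of R(t) given above. It is an illustration only: the parameter values are
the ones quoted for the total failure population, used here simply as
inputs, and the helper names are introduced for this example.

```python
import math

def weibull_hazard(t, a, b):
    """Hazard z(t) = (b / a) * t**(b - 1)."""
    return (b / a) * t ** (b - 1.0)

def weibull_reliability(t, a, b):
    """Reliability R(t) = exp(-t**b / a)."""
    return math.exp(-(t ** b) / a)

def numerical_hazard(t, a, b, h=1.0):
    """Central-difference estimate of -d/dt ln R(t)."""
    log_r_hi = math.log(weibull_reliability(t + h, a, b))
    log_r_lo = math.log(weibull_reliability(t - h, a, b))
    return -(log_r_hi - log_r_lo) / (2.0 * h)

if __name__ == "__main__":
    a, b = 255.0, 0.28  # values quoted in the text for all causes
    for t in (1000.0, 10000.0, 50000.0):
        print(t, weibull_hazard(t, a, b), numerical_hazard(t, a, b))
```

With b below one the analytic hazard falls as t grows; with b equal to one
it reduces to the constant 1/a of the exponential law.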

The best fit for the total failure population is obtained for a = 255 hours
and b = 0.28, and the curve shown in Figure 2-4 represents this relation. The
methodology for fitting a Weibull hazard function to the failure ratio data
is also applied to subsets of the failure data. The fits are not always as
good as that discussed above. The reason is not only that the subsets have
smaller populations, so that greater dispersions have to be expected, but
also that some failure processes seem to follow another distribution.
Nevertheless, the Weibull fit was used as a standard procedure because

-  it fitted the majority of the failure populations quite well

-  it is widely used in other reliability prediction literature

-  the Weibull parameters permit a concise quantitative comparison of
   individual populations.

In some practical applications of reliability prediction other mathematical
representations of the time dependence of the failure ratio may be
preferable, and alternative procedures discussed in Chapter 5 address that
need.
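
One simple way to fit a Weibull hazard function is a straight-line least
squares fit in log-log coordinates, since ln z = ln(b/a) + (b - 1) ln t. The
sketch below illustrates that idea only; it is an assumption that this
resembles the cited procedure, whose details are not reproduced in this
chapter, and the data used are synthetic points generated from known
parameters rather than observed failure ratios.

```python
import math

def fit_weibull_hazard(times, hazards):
    """Least squares fit of ln z = ln(b/a) + (b - 1) ln t; returns (a, b)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(z) for z in hazards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # estimate of b - 1
    intercept = mean_y - slope * mean_x  # estimate of ln(b / a)
    b = slope + 1.0
    a = b / math.exp(intercept)
    return a, b

if __name__ == "__main__":
    # Synthetic hazards generated from known parameters (not study data).
    true_a, true_b = 255.0, 0.28
    times = [4380.0 * k for k in range(1, 17)]  # half-year steps in hours
    hazards = [(true_b / true_a) * t ** (true_b - 1.0) for t in times]
    a, b = fit_weibull_hazard(times, hazards)
    print(f"a = {a:.1f}, b = {b:.3f}")
```

With real failure ratios the points scatter about the line, and the
residuals give a direct indication of how well the Weibull form describes a
given subset.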
The most likely sources of the decreasing hazard observed for the overall
failure population are failures due to design and environmental causes. To
explore this important issue it will be necessary to examine causes of
failures briefly here, while a more detailed discussion, including
definitions of the categories, is deferred until Chapter 3.

Causes of failure were grouped under seven major headings:

-  Design 

-  Environment 

-  Parts 

-  Quality 

-  Operational 

-  Other  known  causes 

-  Unknown 

The distribution of failures among these classifications is shown in Figure
2-6. Failures caused by design show a consistently decreasing failure ratio,
as illustrated in Figure 2-7. Note particularly that the failure ratio for
the eighth year and later is less than 5% of that observed during the first
six months, and that it is approximately one-half of that reported at the end
of the fifth year on orbit. The failure ratio for environmental causes,
shown in Figure 2-8, exhibits approximately similar tendencies, though the
dispersions are greater.
FIGURE 2-8. TIME TREND OF FAILURES CAUSED BY ENVIRONMENT
(horizontal axis: year in orbit, ending)

FIGURE 2-9. TIME TREND OF FAILURES CAUSED BY PARTS + QUALITY
(horizontal axis: year in orbit, ending)
In contrast, the failure ratio associated with parts and quality causes,
shown in Figure 2-9, shows only a small decrease beyond the end of the
third year. The rate of this decrease is only slightly more than can be
accounted for by the shadowing and loss of interest effects discussed in the
previous section. Thus, for failures attributable to parts there does indeed
appear to be a constant hazard region after an initial period of sharply
decreasing failures. The nature of that initial period is discussed later in
this section. Failures due to operational causes come closest to a constant
hazard of all the categories considered here. These are illustrated in
Figure 2-10. Miscellaneous other known causes follow a similar pattern.

Failures due to unknown causes, illustrated in Figure 2-11, show an overall
hazard pattern that is consistent with that found for parts and quality
causes, but there is evidence of a continuing decrease through the eighth
year. As demonstrated in Figure 2-6, unknown causes are a significant
contributor to the total failure ratio. The shape of the failure ratio plot
suggests that there is a greater fraction of parts related failures in that
category than design related failures. The Weibull coefficients for the
total failure population and for individual cause classifications are shown
in Table 2-3. The parameter designated a is a scale factor, similar to MTBF
for the exponential distribution. The b parameter is the shape factor, which
determines whether there is a decreasing, constant, or increasing hazard.
Values of b less than unity correspond to a decreasing hazard, while the
exponential distribution can be represented as a special case of the Weibull
with b = 1. As is seen in the table, design and environment show the sharpest
deviation from the constant hazard condition, while operational and other
known failures show the closest approximation to it.
FIGURE 2-10. TIME TREND OF FAILURES CAUSED BY OPERATION
(horizontal axis: year in orbit, ending)

FIGURE 2-11. TIME TREND OF FAILURES DUE TO UNKNOWN CAUSES
(horizontal axis: year in orbit, ending)
TABLE 2-3  WEIBULL PARAMETERS BY CAUSES OF FAILURE

Cause Classification     Weibull Parameters
                         a (10^6 hrs)    b

All causes               0.000255        0.28
Design                   0.000036        0.06
Environment              0.000047        0.0?
Parts and Quality        0.001035        0.28
Operational              0.113796        0.51
Other known              0.115081        0.57
Unknown                  0.002156        0.32


The findings presented thus far have identified failures due to design and
environmental causes as the most significant factor in producing a decreasing
hazard beyond the initial break-in period of the satellites. However, this
runs counter to the conventional assumption that design failures will become
manifest very early in the operational life of a component and that a period
of successful operation of several years should virtually preclude any
further design failures.

For an understanding of this phenomenon it is instructive to turn to the
stress-strength concept of reliability that was initially developed for
mechanical structures like bridges [FREU45] but has also been found
applicable to electronic and electromechanical equipment [LUSS57, KECE64].
The basic relationships for determining the failure probability according to
this approach are shown in Figure 2-12. The upper part of the figure
illustrates the relation between a constant load and variable strength, such
as might apply to the failure probability (due to dielectric breakdown) of a
capacitor connected across the output of a constant voltage power supply.
The dielectric strength of the capacitors is assumed to be a random variable
whose distribution is determined by the material and process attributes. By
standard design practices the average value of the strength is placed well
above the deterministic level of the applied load (the rated output voltage
of the power supply). Due to the variable nature of the strength a small
fraction of the product, given by the value of the strength distribution at
x1, will fail. These failures will occur almost immediately after the power
supplies that incorporate the low strength capacitors have been placed into
use.

The lower part of the figure represents the case when both the load and the
strength are variable. The load curve represents the probability that the
load will exceed the abscissa value x, whereas the strength curve, as before,
is the probability that the strength will exceed the value of x. This
illustration will apply where the capacitor is placed across an unregulated
power supply, the output voltage of which varies as a function of the line
voltage and of load fluctuations. Although the average value of load is the
same as in the previous example, it is intuitively seen that a greater
fraction of the product will fail. The value of the failure probability in
this case must be computed by a convolution integral [PAPO65] but this
procedure is not necessary for the understanding of the long term decreasing
failure rate. Instead, the focus is on the time of occurrence of the
failures.
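For readers who want to see the size of the effect, the convolution has a
closed form when load and strength are both taken to be independent normal
variables. The sketch below is only an illustration under that assumption;
the voltages used are hypothetical and do not come from the study.

```python
import math
from statistics import NormalDist


def failure_probability(load_mean, load_sd, strength_mean, strength_sd):
    """P(load > strength) for independent normal load and strength.

    The margin (strength - load) is normal with mean
    strength_mean - load_mean and standard deviation
    sqrt(load_sd**2 + strength_sd**2); failure is a negative margin.
    """
    spread = math.hypot(load_sd, strength_sd)
    if spread == 0.0:
        return 1.0 if load_mean > strength_mean else 0.0
    return NormalDist().cdf((load_mean - strength_mean) / spread)


if __name__ == "__main__":
    # Hypothetical capacitor: mean dielectric strength 150 V, sd 20 V.
    regulated = failure_probability(100.0, 0.0, 150.0, 20.0)    # constant load
    unregulated = failure_probability(100.0, 15.0, 150.0, 20.0)  # variable load
    print(f"regulated supply:   {regulated:.4f}")
    print(f"unregulated supply: {unregulated:.4f}")
```

With the same mean load, the variable-load case gives the larger failure
fraction, in line with the intuitive argument above.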

Returning to the example of the power supply capacitor, the initial failure
rate for the unregulated supply may not differ markedly from that of the
regulated supply. However, whereas in the regulated case no failures were
expected after the initial period, there is clearly a mechanism for
continuing occurrence of failures under variable load. The probability that
the output voltage will exceed some value, y, above the nominal level during
the first hour of operation may be extremely small but the probability of
that value being exceeded over a period of one year will certainly be
greater. The capacitors with dielectric strength between the nominal output
voltage and y will fail when that exceedance occurs, and therefore failures
must be expected during the entire period of operation.
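The growth of the exceedance probability with exposure time can be sketched
as follows. The calculation assumes, purely for illustration, that each hour
is an independent trial with the same small probability of exceeding y; the
per-hour value used is hypothetical.

```python
def prob_exceeded_at_least_once(p_per_hour, hours):
    """Probability that level y is exceeded in at least one of `hours` trials,
    assuming independent hours with equal exceedance probability."""
    return 1.0 - (1.0 - p_per_hour) ** hours


if __name__ == "__main__":
    p = 1e-6  # hypothetical chance of exceeding y in any single hour
    for label, hours in (("first hour", 1), ("one month", 730), ("one year", 8760)):
        print(f"{label:>10}: {prob_exceeded_at_least_once(p, hours):.6f}")
```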

The investigation of the occurrence of unusually large or small values of a
random variable was pioneered by E. J. Gumbel and is called statistics of
extremes after the title of his definitive work in that field [GUMB58]. It
deals with phenomena for which no firm upper (or lower) limit can be
established, such as the discharge volume of a river (an early application
was in the investigation of floods), the lifespan of man, or, of particular
interest to spacecraft failures, the intensity of magnetic fields caused by
solar storms. A very terse description of the central problem of the
statistics of extremes was made before the discipline became established:
"However big floods get, there will always be a bigger one coming" [PRES50].
In terms of spacecraft reliability, that the equipment has survived under the
environmental stresses experienced during a period of m years on orbit does
not preclude the occurrence of a phenomenon during year m + 1 that produces a
greater stress and hence leads to failure. However, the likelihood that
greater stresses will be encountered decreases over successive intervals, and
that leads to the decreasing hazard. A brief numerical exposure to the
methodology is presented below.

The probability density of the largest value of n = 1..10 samples drawn from
a standardized normal distribution is shown in Figure 2-13 which is taken
from Gumbel's book. For n = 1 the density of the sample is of course equal
to that of the parent distribution. For a sample of two, the mode for the
largest value is approximately 0.5 standard deviations above the mean of the
parent distribution, but then it takes a sample size of 5 to move the mode to
1 standard deviation above the parent mean, and even at n = 10 it is only at
1.3 standard deviations. (This discussion has centered on the mode, the
highest point on each of the curves, because it is the easiest characteristic
to point out; except for n = 1, the mean and median of the extreme value
distribution are not exactly equal to the mode.)

In terms of spacecraft reliability, each year of operation can be equated to
one observation on the basis that many of the stresses are seasonal (other
interpretations are of course also possible). Table 2-4 lists the
probability of exceeding a previously observed stress level during a given
year on orbit under the above assumption. The data are based on median
values of the extremes for normal variates taken from Graph 4.2.2(2) in
[GUMB58].
TABLE 2-4 PROBABILITY OF EXCEEDING A PREVIOUSLY OBSERVED VALUE

No. of     Median Extr.    Increment           Increment/
Years      Value*          over Prev. Val.*    Year*

2          0.5             0.5                 0.5
4          1.0             0.5                 0.25
6          1.25            0.25                0.125
8          1.4             0.15                0.075
10         1.5             0.1                 0.05

* In multiples of standard deviations
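The median extreme values in Table 2-4 were read from a graph, but they can
be approximated analytically. If the largest of n independent standard normal
draws has distribution Phi(x)^n, its median solves Phi(x)^n = 0.5. The sketch
below assumes that the rows of the table correspond to 2, 4, 6, 8 and 10
years and that the first increment is measured from the parent median (n = 1);
both are interpretations, not statements from the source.

```python
from statistics import NormalDist


def median_of_largest(n):
    """Median of the largest of n independent standard normal draws.

    P(max <= x) = Phi(x)**n, so the median solves Phi(x) = 0.5**(1/n).
    """
    return NormalDist().inv_cdf(0.5 ** (1.0 / n))


def increments(years=(2, 4, 6, 8, 10)):
    """Rows of (years, median extreme, increment, increment per year)."""
    rows = []
    prev_n, prev_val = 1, 0.0  # a single draw has the parent median, zero
    for n in years:
        val = median_of_largest(n)
        inc = val - prev_val
        rows.append((n, val, inc, inc / (n - prev_n)))
        prev_n, prev_val = n, val
    return rows


if __name__ == "__main__":
    for n, val, inc, per_year in increments():
        print(f"{n:>2}  {val:5.2f}  {inc:5.2f}  {per_year:6.3f}")
```

The computed medians agree with the tabulated ones to within graph-reading
accuracy, and the increment per year falls steadily, which is the property
used in the comparison with environmental failures.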


The increment values are plotted together with the time trend of
environmentally caused failures (Fig. 2-8) in Figure 2-14.

FIGURE 2-14 EXTREME VALUES FITTED TO ENVIRONMENTAL FAILURES

For this comparison it was assumed that the median of the normal distribution
from which the extreme values were derived was a stress that caused 1 failure
per spacecraft-year and that each exceedance of this stress by one standard
deviation also caused 1 failure per spacecraft-year. Both of these
assignments are arbitrary and were made to produce a reasonable fit to the
curve in a simple manner (better results could have been obtained by curve
fitting techniques). That exceedance of the stress by one standard deviation
causes 1 failure per spacecraft-year can be interpreted in two ways:

- Spacecraft equipment strength is uniformly distributed so that for each
unit increase in stress the same fraction of failures will be
encountered

-  Spacecraft  equipment  strength  is  normally  distributed,  and  the  normal 
distribution  from  which  the  extreme  values  were  drawn  was  obtained  as 
the  convolution  of  a  normally  distributed  environment  variable  and  the 
normally  distributed  strength  variable  (the  probability  of  failure  of  a 
given  system  is  under  these  conditions  normally  distributed,  and  the 
probability  of  system  failure  over  a  number  of  years  or  for  a  number  of 
systems  will  follow  the  extreme  value  distribution). 

This brief excursion into the field of statistics of extremes has thus
provided a rationale for experiencing a long term decreasing hazard for
failures associated with the intensity of natural phenomena.

It remains to be explained why the time trends for failures due to parts and
quality causes, for which a constant hazard is postulated in the out years,
still show a pronounced decreasing trend during the initial two years.
Several causes are probably responsible for this:

- parts defects that were not properly eliminated by test; these defects
need not cause immediate failure in the post-launch environment because
(a) many spacecraft components do not become operational until sometime
after orbit is achieved, and (b) the failures occur only at elevated
stress levels (an application of the statistics of extremes on a smaller
scale)

- parts failures due to unrecognized design deficiencies, either in the
parts themselves or in portions of the equipment that cause overloads or
otherwise induce the observed failures

- underestimation of the shadowing and loss of interest effects.

The first of these factors is identified as the most significant one, both in
terms of the number of failures caused, and also because it is the one most
under control of project management [TIMM75]. This aspect of the time trend
of spacecraft failures is also closely related to the employment and
effectiveness of screening techniques, a subject that is receiving
increasing attention in the reliability literature [SAAR82].

For reliability prediction at the spacecraft level a single Weibull model,
such as the one shown in Figure 2-4, will be quite suitable. For reliability
prediction at the subsystem and lower levels it is necessary to distinguish
between the two contributions to failure probability as is explicit in the
procedures described in Chapter 5. Examples of the application of these
findings are presented in the next section.


2.4 Examples of Applications

The confirmation of the decreasing hazard phenomenon and the formulation of a
Weibull model for reliability prediction are not merely of theoretical
interest. The following examples show that significant decisions in mission
planning and spacecraft design can be affected by the acceptance of a
decreasing hazard model. In other areas, the distinction between random
(parts and quality) and correlated (design and environmental) failures may
affect reliability related design decisions.

The examples presented here are necessarily simplified and the parameters are
selected to emphasize the difference between the constant and decreasing
hazard assumptions. In a practical case the effects may be smaller than in
these examples but they are usually quite significant. Additional research
in this area will therefore be found beneficial.

2.4.1  Mission  Planning 

A  satellite  mission  may  terminate  for  one  or  more  of  the  following  reasons: 

- catastrophic failure

- exhaustion of consumables such as attitude control gas or propellants
for orbit maintenance; the degradation of solar cells is a related item
because it requires allocation of additional capacity to sustain a long
life on orbit

-  technological  obsolescence 

The last factor does not usually enter into the detailed trade-off
decisions but it sets a time horizon beyond which benefits in the other areas
are immaterial. Trade-offs between reliability (failure prevention) and
consumables are necessary because both make demands on the same resources
(funding and satellite weight). It is intuitively seen that it may be
inefficient to provide consumables for more than 10 years when the predicted
reliability of the prime mission equipment is very low at that point in
time. Conversely, a reliability improvement to extend the satellite MTBF to
eight years may not be warranted if consumables are provided for only five
years.

The following example is a simplified mission planning investigation that
highlights the effect that the choice of the failure distribution can have on
optimum mission duration. It is assumed that spacecraft equipment design is
fixed and that a reliability estimate at the 2 year point is 0.67. The
spacecraft equipment configuration is modeled as two independent redundant
strings of reliability R, so that the spacecraft reliability becomes

R_s = 1 - (1 - R)^2

Consumables are to be provided until the time when the reliability drops
to 0.4. In the first data column of Table 2-5 the time for R_s to reach 0.4 is
computed under the exponential assumption and in the second data column it is
computed using the Weibull distribution for the shape factor of 0.28 which
was found to give a good fit to the total failure population in our sample.

TABLE 2-5 MISSION TIME FOR A SPECIFIED RELIABILITY

Parameter                           Exponential           Weibull

Failure probability at 2 yrs.       0.33                  0.33
Failure prob. equation at 2 yrs.    (1 - e^(-2L))^2       (1 - e^(-1.21/a))^2
Evaluation of parameter (L or a)    0.43                  1.43
Failure probability at x years      0.6                   0.6
Failure prob. equation at x yrs.    (1 - e^(-0.43x))^2    (1 - exp(-x^0.28/1.43))^2
Value of x                          3.5 years             11 years
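The steps of Table 2-5 can be traced with a short sketch. It assumes the
two-string model in which the spacecraft fails only when both strings fail,
and a Weibull string reliability of the form exp(-t^b / a); the function
names are illustrative. Because the Weibull solution is raised to the power
1/b (about 3.6 for b = 0.28), it is very sensitive to rounding of a and b, so
the sketch should not be expected to reproduce the tabulated Weibull figure
exactly.

```python
import math


def string_failure(system_failure_prob):
    """Two redundant strings: F_system = F_string**2."""
    return math.sqrt(system_failure_prob)


def fit_exponential_rate(t, system_failure_prob):
    """Solve (1 - exp(-L*t))**2 = F for the string hazard L."""
    return -math.log(1.0 - string_failure(system_failure_prob)) / t


def fit_weibull_scale(t, b, system_failure_prob):
    """Solve (1 - exp(-t**b / a))**2 = F for the scale parameter a."""
    return t ** b / -math.log(1.0 - string_failure(system_failure_prob))


def exponential_time(rate, system_failure_prob):
    """Time at which the exponential model reaches failure probability F."""
    return -math.log(1.0 - string_failure(system_failure_prob)) / rate


def weibull_time(a, b, system_failure_prob):
    """Time at which the Weibull model reaches failure probability F."""
    return (-a * math.log(1.0 - string_failure(system_failure_prob))) ** (1.0 / b)


if __name__ == "__main__":
    rate = fit_exponential_rate(2.0, 0.33)
    a = fit_weibull_scale(2.0, 0.28, 0.33)
    print(f"L = {rate:.2f} per year, a = {a:.2f}")
    print(f"exponential: {exponential_time(rate, 0.6):.1f} years")
    print(f"Weibull:     {weibull_time(a, 0.28, 0.6):.1f} years")
```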

The time for which consumables are to be provided is much longer for the
Weibull than for the exponential assumption. To determine the potential
benefit of this longer life to the mission planner, assume that the mission
value, V, is given by

V = v * T' - C(T)

where T = nominal mission time (to exhaustion of consumables)

T' = effective mission time or mean mission duration (see note 1)

v = value to the user per actual year on orbit

and C(T) = cost of providing consumables for duration T.

1. This is equivalent to the MMD truncated at the depletion of expendables as
defined in MIL-STD-1543(USAF) "Reliability Program Requirements for Space and
Missile Systems"

T' will be approximated by (1 + R(T))*T/2. Since the mission termination is
defined by R(T) = 0.4, the expression for T' simplifies to 0.7*T. For C(T)
assume 0.05*s*T where s is the basic spacecraft cost. Also, assume that s/v
= 3 (this means that the effective mission time must be at least three years
before the program becomes economically justified). The following data are
required to compute the mission value, V.


                                     Exponential    Weibull

Nominal mission duration, T          3.5            11
Effective mission time, T'           2.6            7.7
Value in terms of satellite cost*    0.87s          2.55s
Cost of consumables, C               0.18s          0.55s
Value excl. satellite cost, V        0.69s          2.00s
Net mission value                    -0.31s         1.00s

* making use of the relation v = s/3.


It is seen that a mission that had a submarginal value under the
exponential assumptions became soundly effective when the Weibull
distribution was used.
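The mission value arithmetic can be sketched as follows, using the stated
relations T' = (1 + R(T))*T/2, C(T) = 0.05*s*T and v = s/3. Treating the net
mission value as V minus the spacecraft cost s is an interpretation of the
table, and the function and key names are illustrative. Small differences
from the tabulated figures are to be expected from rounding.

```python
def mission_value(T, s=1.0, r_end=0.4, cost_to_value_ratio=3.0,
                  consumable_rate=0.05):
    """Mission value figures, expressed in units of spacecraft cost s.

    T: nominal mission time in years; r_end: reliability at end of mission.
    """
    v = s / cost_to_value_ratio           # value per actual year on orbit
    t_eff = (1.0 + r_end) * T / 2.0       # effective mission time T'
    gross = v * t_eff                     # value in terms of satellite cost
    consumables = consumable_rate * s * T
    value = gross - consumables           # V, excluding satellite cost
    return {"T_eff": t_eff, "gross": gross, "consumables": consumables,
            "V": value, "net": value - s}


if __name__ == "__main__":
    for label, T in (("exponential", 3.5), ("Weibull", 11.0)):
        r = mission_value(T)
        print(f"{label:>11}: T' = {r['T_eff']:.2f}, V = {r['V']:.2f}s, "
              f"net = {r['net']:+.2f}s")
```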

2.4.2  Subsystem  Design 


A subsystem consists of three components that have the following mission
reliability (for 5 years) and weight:

Component    Reliability    Weight
             at 5 yrs       lbs.

A            0.90           100
B            0.80           200
C            0.70           300

It is required that the entire subsystem reliability at 5 years be at least
0.70. The 5 year component reliabilities were computed under the exponential
assumption. As in the previous case, the reliability prediction for a 2 year
mission duration was the best validated data point and any Weibull model must
be tied to the same 2 year values.


For the exponential assumption the subsystem reliability requirement can be
met in either of two ways:

- The entire subsystem can be made redundant, requiring only a single
reconfiguration provision, but incurring a weight penalty of 600 lbs.
The resultant reliability will be 0.75, neglecting the failure
probability of the switching circuits.

- Individual components can be made redundant, each with its own
reconfiguration provisions. The minimum weight system that meets the
requirements uses redundancy for A and C, with a reliability of 0.72,
again without allowance for failures in the switching provisions. The
weight penalty is 400 lbs.
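The two options above can be checked by enumerating every combination of
component-level redundancy. The sketch assumes that active and standby units
have equal reliability and ignores switching failures; the dictionary and
function names are illustrative.

```python
from itertools import product

# (5 year reliability, weight in lbs.) for each component
COMPONENTS = {"A": (0.90, 100), "B": (0.80, 200), "C": (0.70, 300)}


def redundant(r):
    """Reliability of two identical units in parallel, perfect switching."""
    return 1.0 - (1.0 - r) ** 2


def series_reliability(components):
    """Reliability of the non-redundant subsystem (all units in series)."""
    result = 1.0
    for r, _ in components.values():
        result *= r
    return result


def component_level_options(components, requirement):
    """All redundancy choices meeting the requirement, lightest first."""
    names = sorted(components)
    options = []
    for flags in product((False, True), repeat=len(names)):
        rel, added_weight = 1.0, 0
        for name, duplicated in zip(names, flags):
            r, weight = components[name]
            rel *= redundant(r) if duplicated else r
            added_weight += weight if duplicated else 0
        if rel >= requirement:
            chosen = tuple(n for n, d in zip(names, flags) if d)
            options.append((added_weight, rel, chosen))
    return sorted(options)


if __name__ == "__main__":
    whole = redundant(series_reliability(COMPONENTS))
    print(f"whole subsystem redundant: {whole:.2f}, +600 lbs.")
    for weight, rel, chosen in component_level_options(COMPONENTS, 0.70):
        print(f"redundant {'+'.join(chosen)}: {rel:.2f}, +{weight} lbs.")
```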

In both cases it was assumed that active and standby systems had the same
reliability. The reliability of a redundant system or component, R_r, was
computed from

R_r = 1 - (1 - R)^2

where R is the reliability of the non-redundant unit.

In order to apply the Weibull model, the reliability at the 2 year point must
first be computed. The hazard, L, is obtained from the five year
reliability, R_5, as

L = -(ln R_5)/5

and then the two year reliability under exponential assumptions becomes

R_2 = e^(-2L)

These values are tabulated together with the computed Weibull 'a' parameter
in Table 2-6. The other entries show the predicted Weibull reliability at 5
years, R'_5, and the reliability obtainable when individual components are
made redundant.

TABLE 2-6 SUBSYSTEM PARAMETERS USING WEIBULL ASSUMPTIONS

Component    Reliab.     Weibull       Weibull      Reliab.
             at 2 yrs    'a' param.    Reliab.      for redund.
                         (years)       at 5 yrs.    component

A                                                   0.99+
B                        13.60                      0.99
C                        8.51                       0.97


The series reliability for the three components is 0.702 which just meets the
minimum requirements. If just component A is made redundant, the reliability
becomes 0.74, comparable with the configurations discussed for the
exponential case, and at a weight increment of only 100 lbs.
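The chain of calculations for the Weibull case can be sketched as below. It
assumes the Weibull reliability has the form exp(-t^b / a) with the shape
factor b = 0.28 found for the total failure population, and that each
component is tied to its exponential 2 year reliability; the names are
illustrative.

```python
import math

B_SHAPE = 0.28
FIVE_YEAR_RELIABILITY = {"A": 0.90, "B": 0.80, "C": 0.70}


def two_year_reliability(r5):
    """Exponential model: L = -(ln R_5)/5, then R_2 = exp(-2L)."""
    hazard = -math.log(r5) / 5.0
    return math.exp(-2.0 * hazard)


def weibull_scale(r2, b=B_SHAPE):
    """Scale a such that exp(-2**b / a) equals the 2 year reliability."""
    return 2.0 ** b / -math.log(r2)


def weibull_reliability(t, a, b=B_SHAPE):
    return math.exp(-(t ** b) / a)


def subsystem_table(five_year=FIVE_YEAR_RELIABILITY):
    """Per component: (R at 2 yrs, a, Weibull R at 5 yrs, redundant R)."""
    rows = {}
    for name, r5 in five_year.items():
        r2 = two_year_reliability(r5)
        a = weibull_scale(r2)
        rw5 = weibull_reliability(5.0, a)
        rows[name] = (r2, a, rw5, 1.0 - (1.0 - rw5) ** 2)
    return rows


if __name__ == "__main__":
    rows = subsystem_table()
    for name, (r2, a, rw5, red) in rows.items():
        print(f"{name}: R2 = {r2:.3f}, a = {a:.2f}, R'5 = {rw5:.3f}, "
              f"redundant = {red:.3f}")
    series = math.prod(row[2] for row in rows.values())
    with_a = series / rows["A"][2] * rows["A"][3]
    print(f"series = {series:.3f}, with A redundant = {with_a:.2f}")
```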


Chapter 3


CAUSES  OF  FAILURE 


By  understanding  the  causes  of  failure  the  users  of  this  report 
may  be  able  to  modify  the  baseline  reliability  prediction 
procedures in the light of their mission or equipment
characteristics.  If  conditions  that  cause  a  specific  class  of 
failures  are  absent  for  a  given  application,  then  the  failure 
prediction can be correspondingly reduced. Conversely, if a
cause  of  failures  is  more  pronounced,  then  the  failure  prediction 
will  have  to  be  increased.  One  of  the  most  constructive  uses  of 
reliability  prediction  is  as  a  design  tool:  to  identify  the 
configurations  that  yield  the  highest  reliability  within  given 
constraints.  In  this  connection,  knowledge  of  the  causes  of 
failure  can  be  effectively  employed  to  improve  the  reliability  of 
new  as  well  as  existing  designs. 

By  way  of  providing  background  for  the  treatment  of  causes  of 
failure,  the  first  section  of  this  chapter  describes  how  failures 
on  spacecraft  are  diagnosed.  The  classification  of  causes  that 
was  already  briefly  described  in  the  preceding  chapter  is  then 
explained  in  detail  and  examples  of  each  type  of  failure  are 
provided.  Next,  differences  in  the  relative  frequency  of  certain 
causes  between  pre-1977  missions  and  later  ones  are  analyzed  and 
some  significant  trends  are  identified.  Finally,  the  association 
of  spacecraft  subsystems  with  the  major  causes  of  failure  is 
investigated. 


The principal tools for diagnosis of spacecraft failures are

- Telemetry

- Analysis of spacecraft operation

- Retrospective analysis after subsequent anomalous events are observed

When spacecraft are returned to the earth there is of course an opportunity
for direct diagnosis of the failure. Only for very few of the failures
reported here was the latter course applicable. Because of the
economic and national security implications of spacecraft failures,
supporting investigations are usually carried out as soon as any off-nominal
operation is observed.

Most spacecraft are heavily instrumented in order to permit monitoring of
their operation, taking corrective measures when unusual events are observed,
and detecting design weaknesses that can be avoided in future launches and
designs. Instrumentation takes the form of

- Measurements of the environment (primarily temperature and radiation
levels) and of supporting functions, such as electric power, common time
bases, and attitude control

- Normal outputs of each payload function, e. g., sensor outputs from
meteorology and earth observation satellites

- Specific diagnostic measurements in both the payload and supporting
functions, including intermediate outputs of all sensor processing and of
housekeeping functions (e. g., attitude error), and local temperature,
vibration, and pressure measurements for pressurized components.

Satellites which are in continuous contact with a ground station can use
direct telemetry for sending the data to the monitoring facility. Satellites
which are not in continuous ground contact (this includes most missions in
low orbits) must first record the data for later downlinking in a compressed
time frame when they are in station contact. The tape recorders required for
this procedure were themselves a very frequently failing component.

As a result of the availability of monitoring data, anomalies are often
diagnosed before they affect the operation of the spacecraft or of the
mission. In many cases, procedures can be initiated to prevent further
progression of the malfunction, and in some cases even remedial action is
possible, e. g., when a high battery temperature is noted, the load on the
battery can be reduced and the battery might be reconditioned by subjecting
it to controlled charge and discharge cycles.

Analysis of spacecraft operations is another important source of failure
information. Examples are loss of power in a communication link, incoherent
sensor output, or failure to execute a command that had been stored or sent.
Tracking data can be used to diagnose malfunctions in propulsion and attitude
control subsystems. The combination of spacecraft operations and telemetry
can be a very effective diagnostic tool, e. g., by sending commands to the
spacecraft that exercise functions believed to be implicated in the
malfunction, and by correlating out-of-spec telemetry data with spacecraft
rotation, spacecraft orbital position (relative to the sun or to the earth),
or other periodic spacecraft activities.

Retrospective analysis can be used to assign causes to malfunctions that had
originally gone undiagnosed. The most common occurrence is that one or more
similar malfunctions are observed in other spacecraft. Just the multiple
observation of identical events will usually indicate that a design-related
cause is involved. Multiple observations will also permit identification of
common features of the anomalies, e. g., all occurring on exiting from an
eclipse or all following transmission of a specific command. Finally, the
diagnosis of one malfunction based on telemetry and/or spacecraft operations
can furnish clues for retrospective assignment of causes to previously
observed occurrences of the same type.

Ground-based support of satellite failure diagnosis consists of analysis of
the on-orbit data (telemetry, tracking, and operational), simulations (based
on analytical models or utilizing suspected hardware components), and
re-inspection of residual hardware (e. g., components procured for future
launches or excess inventory for a current satellite) or of equivalent
hardware (components or parts of the same type and date of manufacture). The
results of such inspections sometimes show defects in parts, workmanship, or
procedures that become candidates for further diagnostic activities of
narrower scope. Sometimes procedural deviations are discovered, e. g., that
parts did not undergo all required tests or that the test might have
overstressed the part.


As indicated in the preceding section, the diagnosis of spacecraft failures
is unique in that

- a sizeable effort by high level technical personnel is devoted to the
diagnosis of most failures

- because of the inaccessibility of the spacecraft the corpus delicti can
only rarely be recovered

The latter factor suggests that the diagnosis of any one malfunction may be
subject to some uncertainty. On the other hand, the comprehensive nature of
the data collection, analysis and reporting effort makes aggregations of
spacecraft failure data a very valuable basis for statistical evaluation. In
order to facilitate meaningful statistical results, fairly broad cause
classifications have been selected so that a population of at least 100
failures exists in each category. This is particularly important when
subclassifications are evaluated, e. g., the distribution in time to failure
after launch by causes that was presented in the preceding chapter. The
following cause classifications were selected on this basis

-  Design 

- Environment

- Parts

-  Quality 

-  Operation 

-  Other  known 

-  Unknown 

In some evaluations failures due to parts and quality are treated as a single
entity, and the same is true in some instances for failures due to operation
and other known causes. In ODAP the cause of failure is expressed in key
words as well as in prose. The key words are either equivalent to those used
here or could be easily translated into them. In the OOSR reports the
failure is described in prose and an "Incident Type" is derived from this
which is classified in two ways

- Electrical, mechanical, other, and unknown

- Catastrophic part failure, other part-related incident, non-part-related,
and unknown

The mapping of OOSR reports into the cause classifications shown above relied
primarily on the prose descriptions.

The classifications which are of primary importance for the reliability
prediction of electronic components are design, environment, parts, and
quality. The conceptual distinctions between these causes are shown in
Figure 3-1. Random parts failures, which are the core subject of the
MIL-HDBK-217 reliability prediction procedures, are in the present data
collection usually characterized by

- the failure is traced to a part or to a small aggregation of parts

- there is no evidence of a design deficiency, excessive environmental
stress, or of a quality related problem

- there is no pattern of similar failures

CAUSE             HOW DIAGNOSED

PARTS (RANDOM)    NON-REPETITIVE
                  NO OTHER CAUSE LIKELY

DESIGN            REPETITIVE
                  ANALYSIS ESTABLISHES THAT STRENGTH IS
                  INADEQUATE IN SOME CIRCUMSTANCES

ENVIRONMENT       USUALLY REPETITIVE
                  ANALYSIS SHOWS LOAD DUE TO ENVIRONMENT TO
                  EXCEED ORIGINAL SPECIFICATION

QUALITY           USUALLY REPETITIVE
                  ANALYSIS SHOWS THAT VARIATION OF STRENGTH
                  EXCEEDS SPECIFICATION

(The ASSUMED LOAD/STRENGTH RELATION column of the figure shows load and
strength distribution plots for each cause; horizontal axis: load or
strength.)

FIGURE 3-1 REPRESENTATION OF FAILURE MECHANISMS

Typical part failure synopses are "Decryptor B-side power supply failure;
suspect intermittent open in transistor of power converter" (ODAP incident
2426), or "Solar array temperatures appear abnormal but no effect on power
output; due to array thermistor failure" (IDE Incident 9 in OOSR).

Design  failures  can  be  of  two  types:  selection  of  parts  that  do  not  possess 
sufficient  strength  as  indicated  in  the  figure,  or  not  allowing  for  the  full 
range  of  spacecraft  operations.  An  example  of  the  former  is  "Sensor  circuit 
reset  while  using  back-up  encoder;  the  detectors  within  the  optical  decoder 
are  sensitive  to  Van  Allen  belt  energetic  particles."  (ODAP  incident  466) 
This failure occurred in 1979 when the characteristics of the Van Allen belt
were well known and should have been considered in the design. The report on
this incident also references another problem of the same type. An example
of a more operations related design deficiency is "Sunlight entered sensor of
electrons and photons experiment, causing loss of about 50% of the experiment
data; design error or oversight - the sensors were light sensitive" (ISEE-1
incident 1).

Environment is listed as a cause of failure where unanticipated environmental
effects were encountered or where the magnitude of anticipated events was
greater than specified or expected. As indicated in the figure, the load due
to the environment frequently has a very long right tail which causes
occasional failures even in parts or components which were correctly designed
according to the original mission specification. Although the load
distribution is shown here as normal, it may actually be more closely
approximated by an extreme value distribution as discussed in the previous
chapter. The significant feature in either case is a long right tail.
Examples are "Ionospheric plasma monitor data is degraded, apparently caused
by static charge build-up on spacecraft" (ODAP Incident 500), and "Delayed
restart of Operational Linescan System (2 minutes compared to normal 15 - 40
seconds). May be due to unusual pattern of proton effects" (ODAP Incident
508). The component involved in the first example had been designed when
spacecraft charging was a little-understood phenomenon, and therefore the
problem is not classified as improper design. In the second example the
environment is specifically described as unusual.
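
The effect of the long right tail can be made concrete with a small numerical
sketch. The Python fragment below is illustrative only: the function names,
the zero mean and unit standard deviation, and the choice of a Gumbel (type I
extreme value) law are assumptions and are not taken from the report's data.
It compares the probability that the load exceeds a fixed strength margin for
a normal load and for an extreme value load with the same mean and standard
deviation.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def normal_exceedance(x, mean, std):
    """P(load > x) for a normally distributed load."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def gumbel_exceedance(x, mean, std):
    """P(load > x) for a Gumbel (type I extreme value) load that has the
    same mean and standard deviation as the normal case."""
    scale = std * math.sqrt(6.0) / math.pi   # Gumbel scale parameter
    location = mean - EULER_GAMMA * scale    # Gumbel location parameter
    z = (x - location) / scale
    # 1 - exp(-exp(-z)), written with expm1 to keep precision in the tail
    return -math.expm1(-math.exp(-z))


if __name__ == "__main__":
    mean, std = 0.0, 1.0                     # hypothetical load, arbitrary units
    for k in (2, 3, 4):                      # strength margin in standard deviations
        x = mean + k * std
        print(k, normal_exceedance(x, mean, std), gumbel_exceedance(x, mean, std))
```

Under these assumptions the extreme value load exceeds a margin of four
standard deviations on the order of a hundred times as often as the normal
load, which is the sense in which a correctly specified part can still see
occasional environmental failures.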

Quality is assigned as a cause when there are repeated failures in the same
part or assembly that cannot be attributed to design or environment, or which
correlate with quality defects found in populations of similar parts. An
example of the former type is "Shunt voltage in power conditioning assembly
indicates erratic fluctuations. Probable cause is opening of collector
resistor in shunt driver circuitry. Similar problems were encountered on two
previous flights" (ODAP Incident 1342). Correlation with ground observations
governed the classification of Viking Lander 1 Incident 1: "Telemetry
indication of reduction in internal pressure of radiothermal generator 1.
Traced to leakage of gases into the pressure transducer reference cavity.
Suspected prior to launch based on pre-launch pressure data." Failures that
were traced to improper test or that were test induced were also placed into
the quality category. An example of this cause is "Mass deployment telemetry
switch did not indicate that boom had been deployed. Attributed to
deformation of actuator during ground system test. Revised tooling and
installation procedures" (ODAP Incident 43). The representation of this cause
of failure in Figure 3-1 by a standard distribution of strength with large
variance is a very general indication of the failure process. In practice,
it is more likely that there is a bimodal distribution and failures occur
only in the (anomalous) low strength portion of the population.
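
The bimodal interpretation can be sketched numerically. The following Python
fragment is a minimal illustration under stated assumptions: the strength of
each subpopulation is taken as normal, the load is a fixed value, and all
numbers and names are hypothetical rather than drawn from the data base. It
computes the share of failures that originate in the anomalous low strength
portion of the population.

```python
import math


def normal_cdf(x, mean, std):
    """P(X <= x) for X ~ Normal(mean, std)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def weak_share_of_failures(load, weak_fraction, weak_mean, strong_mean, std):
    """Fraction of all failures that originate in the weak subpopulation.

    A part is taken to fail when its strength is below the applied load.
    """
    p_fail_weak = normal_cdf(load, weak_mean, std)
    p_fail_strong = normal_cdf(load, strong_mean, std)
    weak = weak_fraction * p_fail_weak
    strong = (1.0 - weak_fraction) * p_fail_strong
    return weak / (weak + strong)


if __name__ == "__main__":
    # Hypothetical numbers: 2% of parts are anomalous, with mean strength 6
    # instead of 10 (arbitrary units); the applied load is 5.
    share = weak_share_of_failures(load=5.0, weak_fraction=0.02,
                                   weak_mean=6.0, strong_mean=10.0, std=1.0)
    print(f"share of failures from weak parts: {share:.4f}")
```

With these hypothetical values nearly all failures come from the small weak
subpopulation, even though it makes up only a few percent of the parts.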

As had already been indicated in the previous chapter, failures classified
into the unknown category were most likely due to parts. This is consistent
with the diagnostic key for parts failures indicated in Figure 3-1:
non-repetitive and no other cause likely. The primary criterion that led to
placement into this category rather than into parts was insufficient data in
the reports. Examples are "Faulty multiplexer no. 1 channel caused loss of
some narrow coverage driver TWTA temperature data. Switched to redundant
multiplexer" (ODAP Incident 25) or "Manifold pressure increased out-of-limits
following simultaneous firings of + pitch and - roll. Returned to normal
within one orbit" (Nimbus-7 Incident 48).


Failures  classified  as  due  to  operation  involved  sending  improper  commands  to 
the  spacecraft  or  faulty  ground  software.  Examples  are  "Temporary  drop  in 
battery  power.  Improper  reconditioning  (by  ground  command)  caused  cell 
failure.  Recovered  by  using  new  deep  charge  reconditioning  technique"  (ODAP 
Incident  1130)  or  "Address  accept  was  not  transmitted  during  upload.  If  a 
message  is  sent  to  spacecraft  within  6  seconds  of  receiver  turn-on,  the 
message  is  not  accepted.  Corrective  action:  wait  at  least  7  seconds  after 
receiver  turn-on  before  uploading"  (ODAP  Incident  1194).  Very  few  of  the 
failures  (less  than  1%)  were  due  to  faulty  on-board  software.  This  is  not
too  surprising  because  only  two  of  the  major  missions  utilized  significant 
computer  programs  (contrasted  with  stored  telemetry  or  timing  routines).  An 
example  of  an  on-board  software  failure  is  "Large  yaw  error  while  switching 
central  processors.  Traced  to  software  fault;  rewrote  procedure"  (ODAP 
Incident  1326). 

The  classification  of  other  known  failures  includes  early  depletion  of 
consumables  (attitude  control  gas,  orbit  make-up  propellant),  wearout 
failures,  and  wiring.  Examples  are  "Radiometer  scan  drive  motor  showed  signs 
of  periodic  loss  of  speed  after  18  months  on  orbit,  may  be  due  to  old  age" 
(ODAP  Incident  1527)  or  "Sensor  lost  lock  on  limb  due  to  increased  detector
temperature  caused  by  depletion  of  the  cryogen"  (Nimbus-7  Incident  29). 

It  is  probably  evident  from  this  discussion  that  the  classification  involved 
some  judgement.  In  ODAP  this  led  to  the  assignment  of  multiple  causes  for 
some  failures,  a  practice  which  was  also  followed  in  this  report  (the  data 
base  allows  for  up  to  three  causes  but  this  limit  was  only  infrequently 
utilized).  One  result  of  the  multiple  classification  is  understatement  of 
the  relative  frequency  of  the  unknown  category  which  is  only  rarely  used 
together  with  any  other  cause  while  failures  in  the  remaining  categories  may 
be  counted  more  than  once  (but  only  for  the  purpose  of  classification). 
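
The effect of multiple classification on relative frequencies can be shown
with a short sketch. The Python fragment below uses hypothetical reports and
a hypothetical function name; only the limit of three causes per failure is
taken from the text. Every assigned cause is counted, so a category that is
rarely paired with another cause (here "unknown") receives a smaller share of
cause assignments than its share of failure reports.

```python
from collections import Counter

MAX_CAUSES = 3  # the data base allows up to three causes per failure


def cause_fractions(records):
    """Relative frequency of each cause when every assigned cause is counted.

    `records` is an iterable of cause lists, one list per failure report.
    """
    counts = Counter()
    for causes in records:
        if not 1 <= len(causes) <= MAX_CAUSES:
            raise ValueError("each failure needs one to three causes")
        counts.update(causes)
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical reports, not taken from the data base.
    reports = [
        ["unknown"],
        ["unknown"],
        ["design", "environment"],
        ["parts", "quality"],
    ]
    fractions = cause_fractions(reports)
    print("unknown, share of cause assignments:", fractions["unknown"])
    print("unknown, share of failure reports:  ", 2 / len(reports))
```

In this example the unknown category accounts for half of the reports but
only one third of the cause assignments.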


3.3  HISTORICAL  TRENDS  IN  CAUSES  OF  FAILURE 

For the purpose of reliability prediction it is of interest to investigate
historical trends in the causes of failure. If the recently launched
spacecraft exhibit a drastically different failure pattern, then this should
be taken into account in the prediction methodology. For the investigation
of historical trends the spacecraft were divided into two categories:

- Early programs: the first launch took place prior to 1977

- Late programs: the first launch took place in 1977 or later

Spacecraft  in  the  latter  category  are  likely  to  utilize  medium  to  large  scale 
integrated  semiconductors  and  are  therefore  more  representative  of  the 
designs  addressed  by  future  reliability  studies.  It  must  be  recognized, 
however,  that  reliability  prediction  based  on  interpretation  of  field  data 
has  inherent  limitations  in  dealing  with  new  part  types  or  design  methods. 

The  distribution  of  causes  in  the  two  chronological  divisions  is  shown  in 
Figure  3-2.  It  is  seen  that  failures  caused  by  design  and  environment 
constitute  a  considerably  greater  proportion  among  the  late  programs,  and 
that  failures  due  to  parts,  quality,  and  unknown  causes  are  a  correspondingly 
smaller  proportion.  A  summary  of  aggregated  causes  is  shown  in  Table  3-1. 

TABLE 3-1 VARIATION OF CAUSES WITH DATE OF FIRST LAUNCH

                       Fraction of All Causes
Cause                  Early Programs    Late Programs

Des & Env              [illegible]       [illegible]
P, Q & Unkn            [illegible]       [illegible]
Oper & Other           .118              [illegible]
FIGURE 3-2 DISTRIBUTION OF CAUSES
[Bar charts: NO. OF REPORTS versus CAUSE, one panel per program group]

A positive conclusion from this summary is that improved parts selection and
quality control for space applications seem to have borne fruit. A more
surprising finding is that advances in design and environmental studies do
not seem to have kept pace with the demands of space missions. One
explanation is that a number of new mission types, such as navigation, have
been introduced and that many of the design problems are associated with
these. New part types, particularly large scale semiconductor memories, also
saw their first use in space in the late programs, and some of the
environmental failures are due to radiation effects on these. These effects
are readily seen in the distribution of failures by subsystem shown in Figure
3-3 for design and in Figure 3-4 for environment.

Further, a part of the increase in design and environmental causes is due to
improved instrumentation, observation, and analysis. Failures due to unknown
causes have decreased from over 20% in early programs to less than 15% in late
ones. As a result of greater experience and better data, failures that would
have been undiagnosed or assigned to random parts failures are now recognized
as due to design problems.

In the preceding chapter it was seen that design and environment caused a
much more pronounced and continuing decrease in the failure ratio than all
other causes. Due to the increased proportion of failures caused by design
and environment, it might be expected that the failure ratio for late programs
would show a more sharply decreasing trend than the pattern discussed in
Chapter 2 (particularly Figure 2-4). However, this could not be verified,
partly because differences in the mission mix made it difficult to isolate
effects due to causes, and partly because the late programs yielded
insufficient data for times on orbit in excess of three or four years. Since
the cut-off date for this report was January 1984, no spacecraft launched
after January 1977 could have accumulated more than 7 years in orbit and only
a very small number had accumulated five or more years.
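
The bound on accumulated orbit time follows directly from the two dates. A
minimal Python sketch is given below; the function names are hypothetical,
and because the report gives the cut-off only to the month, the first day of
the month is assumed.

```python
from datetime import date

CUTOFF = date(1984, 1, 1)   # report cut-off, January 1984 (day of month assumed)
FIRST_LATE_YEAR = 1977      # first launch in 1977 or later => late program


def program_era(first_launch_year):
    """Classify a program as 'early' or 'late' by its first launch year."""
    return "late" if first_launch_year >= FIRST_LATE_YEAR else "early"


def max_years_on_orbit(launch, cutoff=CUTOFF):
    """Upper bound on accumulated orbit time at the cut-off, in years."""
    return max(0.0, (cutoff - launch).days / 365.25)


if __name__ == "__main__":
    print(program_era(1976), program_era(1977))
    print(round(max_years_on_orbit(date(1977, 1, 1)), 2))
```

A spacecraft launched in January 1977 can therefore contribute at most about
seven years of exposure, and later launches correspondingly less.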

FIGURE 3-3 DESIGN PROBLEMS BY SUBSYSTEM

The overall failure ratio for late programs is about twice as large as for
early programs. This should not be interpreted as a decrease in reliability
for either satellites or parts. The major cause of the higher failure ratio
is the much greater complexity of the satellites launched by the late
programs. At least three factors contribute to the increase in complexity:

- Multi-mission satellites, e.g., combining earth observation and
  meteorology or providing several types of communications on one
  satellite

- Higher performance and accuracy of individual missions, e.g., more
  channels and higher signal-to-noise ratio for communication payloads,
  increased accuracy and ease of use for the navigation function

- Increased use of redundancy to support longer mission durations

It is difficult to quantify the increase of complexity in terms of component
or parts counts, partly because the data are difficult to obtain but mostly
because the definition of parts and components has undergone very major
changes, particularly in the electronics field. The improved ruggedness of
recent satellites as a whole can be seen from the greatly reduced fraction of
failures that are in the high severity categories (see Chapter 4 for a
further description of the severity classifications).

TABLE 3-2 SEVERITY OF FAILURE FOR EARLY AND LATE PROGRAMS

Classification                  Early Programs      Late Programs
Code  Description               Count   Percent     Count   Percent

 1    Critical failure           186      10          18       3
 2    Single point failure       160       8          28       5
 3    Redundant unit             353      18          68      12
 4    Work-around req'd          339      18         101      17
 5    Degraded performance       499      26         117      20
 6    Temporary failure          334      17         225      38
 7    Others                      52       3          32       5

Critical failures, which terminate the operation of the entire satellite or a
major function, represent a much smaller percentage of the total for late
programs. Conversely, failures which have only a temporary effect on
satellite operation represent a much higher fraction of the total for late
programs.
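
The percentages in Table 3-2 can be checked against the counts with a few
lines of Python. The counts below are transcribed from the table; the
function name is hypothetical and the rounding to whole percent is an
assumption about how the table was prepared.

```python
# Counts transcribed from Table 3-2, severity codes 1 through 7.
EARLY = [186, 160, 353, 339, 499, 334, 52]
LATE = [18, 28, 68, 101, 117, 225, 32]


def percentages(counts):
    """Share of each severity code, rounded to whole percent."""
    total = sum(counts)
    return [round(100.0 * n / total) for n in counts]


if __name__ == "__main__":
    print(sum(EARLY), percentages(EARLY))
    print(sum(LATE), percentages(LATE))
    # Size of the entire data base relative to the late programs alone
    print((sum(EARLY) + sum(LATE)) / sum(LATE))
```

The computed percentages reproduce the tabulated ones. The totals are 1923
incidents for early programs and 589 for late programs, so the entire data
base is a little more than four times as large as the late-program subset.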


Although there are noticeable differences in causes, failure ratio and
severity between early and late programs, the advantages of utilizing the
entire data base for reliability prediction outweighed those of restricting
it to the late programs. The advantages considered in this connection
included

- the incident population available is approximately four times as large

- hazard trends could be evaluated through the eighth year after launch

- meaningful sub-analyses could be investigated

The detailed evaluation of failure ratios by subsystems and missions in the
next section and in the following chapter permits tailoring of the
reliability prediction for the equipment population and orbit characteristics
of newer satellites. A specific case is the evaluation of navigation
satellites, a mission type that was only rarely encountered prior to 1977.


This section analyzes for each of the major cause classifications (a) where
the failures arise (primarily by subsystem) and (b) whether there are
significant differences in the locale of the failures between early and late
programs. The data presented here identify the baseline population for the
reliability prediction procedures of Chapter 5. This information may be used
to tailor prediction for new satellite types in which the mix of subsystems
and functions differs significantly from previous designs, but specific
tailoring procedures are not provided as part of this report.

In each of the following subsections the distribution of causes of failures
among spacecraft subsystems is illustrated by means of bar graphs. The
ordering of the subsystems along the horizontal axis is by decreasing failure
contribution in the total satellite population. If the representation of
subsystem failures within a given cause corresponds to that within the total
data base, the height of the bars will decrease from left to right. Any
deviation from a strictly decreasing pattern indicates an atypical
contribution of subsystems within a given cause. Only the most important
of these deviations are commented on.
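
The reading rule for the bar graphs can be expressed as a short procedure.
The Python sketch below is one possible interpretation, not the procedure
used in the study: a subsystem is flagged when its bar is taller than the
lowest bar to its left. The function name and the counts are hypothetical;
the subsystem acronyms are those of Table 4-2.

```python
def atypical_subsystems(baseline_order, cause_counts):
    """Return subsystems whose bar breaks the decreasing pattern.

    baseline_order: subsystem names sorted by decreasing failure
        contribution in the total satellite population.
    cause_counts: number of failures per subsystem for one cause.
    """
    flagged = []
    lowest_so_far = None
    for name in baseline_order:
        count = cause_counts.get(name, 0)
        if lowest_so_far is not None and count > lowest_so_far:
            flagged.append(name)      # taller than a bar to its left
        else:
            lowest_so_far = count     # pattern still non-increasing
    return flagged


if __name__ == "__main__":
    order = ["TELM", "GUID", "POWR", "THER", "STRUC"]
    # Hypothetical design-failure counts, for illustration only.
    counts = {"TELM": 40, "GUID": 20, "POWR": 15, "THER": 25, "STRUC": 18}
    print(atypical_subsystems(order, counts))
```

With these hypothetical counts the thermal and structural subsystems are
flagged, since both exceed the power subsystem that precedes them in the
baseline ordering.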


3.4.1  Design 

Because failures caused by design constitute the largest category (almost 25%
of the total), non-conformance to a decreasing pattern among the bar graphs
is particularly significant. In Figure 3-5A, which encompasses the early
programs, two subsystems have a clearly excessive representation: thermal and
structures. The leading causes of design failures in the thermal subsystem
were inadequate thermal models during the first decade of space flight and
failure to account for deterioration of thermal coatings in the space
environment. Most of the design failures in the structures subsystem were
associated with deployment mechanisms (latches, articulated booms, and
separation devices).

As can be seen in Figure 3-5B, which illustrates the same relation for late
programs, improved modeling and better understanding of the characteristics
of coatings have greatly reduced the incidence of design failures in the
thermal subsystem. There has also been a considerable improvement in the
structures area, although the design of deployment devices continues to be a
source of failures. The data management subsystem, which made only a very
small contribution in the early programs, has become a very significant cause
in late programs. The main reason for this is that there were very few data
management functions in satellite designs that saw their first launch prior
to the mid-1970s. Data management systems will continue to increase in
importance and complexity in future satellites, and the contribution of
design failures in these should be an area of concern. Redundancy, which is
widely used to permit digital equipment to be used in critical applications,
provides only limited protection against failures due to faulty design. The
other very significant change from the pre-1977 experience was associated
with the navigation payload subsystem, which constituted the second largest
number of failures in Figure 3-5B. Here, again, the change in satellite
functions is a major factor in the difference between the early and late
programs. However, there have been some unusual reliability problems in the
navigation payloads, as further discussed in Section 4.4.1.

The telemetry subsystem is the largest contributor to design causes during
both periods covered in Figure 3-5. The percentage of total design failures
due to this subsystem has increased somewhat in late programs. The
telemetry, tracking and command functions in recent satellite designs are
very complex and there is no indication that this trend will abate. In the
context of reliability prediction the telemetry subsystem is one of the more
stable spacecraft components. The relative contribution of the guidance and
visual/infrared sensor subsystems to the design failures is much less in late
programs than in early ones. In both cases there has been a considerable
maturation in system design and a very marked improvement in component
technology which permits more conservative design.

3.4.2  Environment 

The general trend for failures due to environmental causes shown in Figure
3-6 is very similar to that found for design causes. In early programs the
thermal subsystem contributes a disproportionately large number of failures
but this tendency is much reduced in late programs. The visual/infrared
sensor subsystem has the second largest number of failures in early programs,
largely due to lack of knowledge of space effects on optics and sensitive
sensor elements. The contribution of this subsystem to environmental failures
in late launches is much less. The navigation and data subsystems show up as
the second and third most frequent cause of failures due to the environment,
and this is again related to the greater representation of these systems on
recent designs and the lack of experience on space effects on the
components.

The environmental failures in the telemetry subsystem are partly due to
unusual space effects such as solar flares, but another significant segment
is due to electromagnetic interference. Some of the latter arises from
equipment aboard the spacecraft but a large amount comes from terrestrial and
unknown sources. Fortunately many of these failures affect the spacecraft
only temporarily. The power subsystem accounts for about one-eighth of all
environmental failures during both periods. Most of these failures are
associated with solar cells and battery charging circuits.

3.4.3  Parts  and  Quality 

As indicated in Figure 3-7, the data management and navigation payload
subsystems are particularly significant contributors to failures due to parts
and quality. The navigation function has the largest number of failures due
to this cause among late programs, while data management accounts for
approximately 15% of the failures in both time periods. The communication
payload is a significant factor in early programs but much less so in late
ones. Telemetry, data management, and the navigation payload are the largest
users of semiconductors on the spacecraft, and therefore the distribution of
parts and quality failures shown in Figure 3-7B is not too surprising.

3.4.4  Unknown  Causes 

It is seen in Figure 3-8A that for early programs the telemetry subsystem
accounts for 35% of all failures due to unknown causes, a proportion that is
markedly higher than seen in any other cause. Part of the reason may have
been lack of instrumentation in this function in the earlier satellites.
Figure 3-8B shows that in late programs the unknown failures due to telemetry
represent only about one-half of that fraction, more in line with the
representation of telemetry in the remaining causes. The power subsystem is
a large contributor in both time periods but particularly among recent
programs. Many of these failures are associated with power conversion
electronics, a function that is apparently not well instrumented.

The guidance subsystem, visual/infrared sensors, and special payloads are
other major contributors to unknown causes in recent programs. Among the
guidance and sensor failures are many that cause only minor disturbances and
which might conceivably have been overlooked on earlier flights. The
increased contribution of special payloads is largely due to a higher
representation of this category in recent programs.

Chapter 4

DETAIL EFFECTS AND FACTORS


This chapter presents analyses of the failure severity, of the
effects of complexity, and of failure rates in a number of
partitions of the total satellite population. The conclusions
are summarized in Table 4-1. Section 4.1 discusses failure
distributions by severity; Section 4.2 examines partitions based
on subsystems, Section 4.3 analyzes complexity effects, Section
4.4 mission effects, and Section 4.5 orbit effects.

TABLE 4-1. RESULTS OF ANALYSES PRESENTED IN THIS CHAPTER

PARTITION     EFFECTS

SEVERITY      Frequency of occurrence is inversely related to severity.

SUBSYSTEMS    Telemetry, guidance, and electrical power are the largest
              sources of failures; thermal and structural subsystems
              are among the lowest. Differences in the failure
              distributions in pre- and post-1977 programs reflect
              maturing technologies in some subsystems (e.g., guidance,
              communication payloads) versus increasing complexity in
              others (e.g., data management, visual-IR).

              The failure rates of electronic and electromechanical
              subsystems generally decrease as a function of time
              whereas mechanical subsystems do not exhibit such
              behavior.

              The importance of parts and quality causes increases with
              the maturity of subsystems, particularly in electronic
              subsystems.

COMPLEXITY    Indicators of complexity can demonstrate statistically
              significant differences in failure rates.

MISSION       Significant differences in failure rates are evident for
              different classes of missions.

ORBIT         Low orbit (i.e., perigee less than 200 km) satellites
              have a higher failure rate than higher orbit satellites.
              However, such differences can be accounted for by payload
              and specific subsystem characteristics rather than by
              environmental differences.

For the purposes of this study, the severity of failures was categorized as
follows:

1. Critical Failure: entire satellite or a major mission function
   fails. Example: Loss of S-band and instrument operation due to
   spacecraft power problems. Attempted work-around but to no avail
   (AEM-1, Incident 11)

2. Single Point Failure: major assembly or component failure. Example:
   No output from sensor 25, band 5 and degraded output from sensor 26.
   Loss of IR data causes significant mission impairment. Periodic
   outgassing performed to clean sensors but not successful in long run
   (Landsat-3, Incident 8)

3. Redundant Unit Failure: requires activation of a back-up component or
   system. Example: Command clock power supply #2 failed; switched to
   redundant power supply but only one command link now open (Landsat-2,
   Incident 16)

4. Work-around: failure requires change in operating procedures and may
   cause degraded performance. Example: Auxiliary command memory halted
   due to fixed core checksum error. Checksum modified to accommodate the
   error (Landsat-3, Incident 7)

5. Degraded Performance: failure degrades performance of a mission
   function. Example: Threshold problems in coastal zone color scanner
   cause loss of data in channels 1-4, reducing water coverage from 90% to
   50 - 60% (Nimbus-7, Incident 1)

6. Temporary Failure: full capability restored spontaneously or after
   recovery procedure. Example: Stratospheric sounder scan shifted 43
   counts and there were other irregularities in the command logic.
   Mission effect was small, and the problem has not recurred (Nimbus-7,
   Incident 13)

7. All other failures: usually not affecting a mission function.
   Example: Earth resource budget scanhead went into a forbidden zone.
   Attributed to gimbal motor torque margin and lubricant viscosity.
   Negligible effect on mission (Nimbus-7, Incident 17)

Figure 4-1 shows the distribution of all failures by severity. It is seen
that failure frequency is inversely related to severity, i.e., serious
failures occur less often than trivial ones. That the distribution peaks at
category 5 rather than at 7 is probably due to the tendency not to report all
failures that result only in a temporary anomaly or that have no significant
effect on the mission.

Figures 4-2 through 4-5 illustrate the distribution of failures by severity
over orbital life segments, starting with the first month on orbit and going
out to lifetimes of five years or more. It is seen that this distribution
remains roughly the same for the first three intervals investigated.
However, for failures occurring after 5 years, there is a marked drop in the
proportion of reported failures in severity categories above 3. As already
discussed in Section 2.2, the major reason for this appears to be the
decreasing thoroughness of the failure reporting procedures, particularly for
missions which had considerably surpassed the initially estimated lifetime
and for which operating staff may have been reduced. A clear indication of
this phenomenon is that the mode shifts from category 5 to category 3. The
ratio of severity 4 and higher failures to those of severity 1-3 is 2.4 in
Figure 4-3 and only 1.2 in Figure 4-5. The total data loss due to this
process is unlikely to be more than 60 failures.
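
The two indicators used above, the mode of the severity distribution and the
ratio of severity 4 and higher failures to those of severity 1-3, are simple
to compute. The Python sketch below uses the pooled early and late counts
from Table 3-2 as stand-in data, since the counts behind Figures 4-2 through
4-5 are not reproduced here; treating the pooled counts as representative of
the whole data base is an assumption, and the function names are hypothetical.

```python
# Early plus late counts from Table 3-2, severity codes 1 through 7.
POOLED = [186 + 18, 160 + 28, 353 + 68, 339 + 101, 499 + 117, 334 + 225, 52 + 32]


def severity_mode(counts):
    """Severity code (1-based) with the largest number of failures."""
    return max(range(len(counts)), key=counts.__getitem__) + 1


def minor_to_major_ratio(counts):
    """Failures of severity code 4 and higher divided by those of codes 1-3."""
    return sum(counts[3:]) / sum(counts[:3])


if __name__ == "__main__":
    print("mode:", severity_mode(POOLED))
    print("ratio:", round(minor_to_major_ratio(POOLED), 2))
```

For the pooled counts the mode is category 5, in agreement with the peak
noted for the overall distribution. A shift of the mode toward category 3,
or a drop in the ratio, would be the signature of under-reported minor
failures described in the text.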


FIGURE 4-1 DISTRIBUTION OF ALL FAILURES BY CRITICALITY
FIGURE 4-3 DISTRIBUTION OF SECOND MONTH FAILURES BY CRITICALITY
FIGURE 4-5 DISTRIBUTION OF FAILURES AFTER 5 YEARS BY CRITICALITY
[Bar charts: FRACTION versus CRITICALITY CLASS]

This section discusses the location of failures in terms of subsystems. The
following 11 subsystems, listed in order of decreasing failure frequency, are
analyzed (definitions were adapted from [ODAP84]):

1. Telemetry, tracking, and control: used for commanding the satellite by
   receiving ground commands and decoding and distributing them to other
   satellite subsystems. It directs steerable antennas and transmits
   state-of-health, tracking, and payload data to ground stations. It
   includes tape recorders where these are required in connection with
   ground communication. The name of this subsystem is sometimes
   shortened to 'telemetry' but is always meant to include the total
   functions just described.

2. Guidance and stabilization: used for initial satellite guidance in the
   ascent phase and orbit acquisition. It may then be used for keeping a
   despun platform stationary with respect to the earth and to avoid
   sensor lock on the sun and moon. It provides firing pulses to the
   propulsion subsystem and is involved in spacecraft stabilization,
   orbital drift corrections, and mid-course corrections.

3. Electrical power and distribution (including solar cells, batteries and
   thermionic power supplies): generates, stores, conditions, and
   distributes electrical power to the other subsystems.

4. Visual-IR sensors: Earth measurement and observation in the IR and
   visual spectrum (e.g., spectrophotometers, radiometers, scanning and
   chopping interferometers, and vidicon cameras).

5. Data management (including CPUs, timers, and memory): stores and
   processes instructions, data, constants, and other parameters. It also
   includes software packages and timing functions.

6. Thermal: regulates the temperature in various compartments of the
   satellite by means of thermostats, heat pipes, louvres, heaters,
   coatings and cryogenics.

7. Communication payload: payload on board communication satellites,
   including antenna pointing and de-spin provisions.

8. Specialized payloads: primarily scientific and surveillance payloads
   not included in other payload categories.

9. Propulsion: furnishes thrust for orienting the spacecraft and
   correcting orbital drift.

10. Structural: consists of the primary structure, protective coverings,
    separation mechanisms, deployment devices, and ordnance.

11. Navigation payload: payload on board navigation satellites.

The telemetry, power distribution, guidance, thermal control, and propulsion
subsystems are present on all missions. Visual-IR sensors and special
payloads were deployed on scientific, meteorological, reconnaissance, earth
resources, and surveillance satellites. The communication and navigation
payload subsystems were used on communication satellites and navigation
satellites, respectively.


Section 4.2.1 discusses the distribution of failures among these subsystems,
section 4.2.2 analyzes the time-dependence of failures by subsystem, section
4.2.3 investigates causes within some subsystems, and section 4.2.4 looks at
groups of subsystems which are characterized by the predominance of
electronic, electromechanical, or mechanical equipment. Table 4-2 summarizes
the results of these analyses.

TABLE 4-2. SUMMARY OF SUBSYSTEM ANALYSES

SUBSYSTEM        ACRONYM  PREDOMINANT        DECREASING     PRIMARY FAILURE
                          EQUIPMENT          FAILURE RATE*  MECHANISM**

Telemetry        TELM     Electronic         Yes            Design/Envmt
Guidance         GUID     Electromechanical  Yes            Design/Envmt
Power            POWR     Electromechanical  Yes            Design/Envmt,
                                                            Parts/Quality
Vis.-IR Sensors  VI-S     Electronic         Yes            Design/Envmt,
                                                            Parts/Quality
Data Mgmt.       DATA     Electronic         Yes            Design/Envmt,
                                                            Parts/Quality
Thermal          THER     Mechanical         No             Design/Envmt
Comm. Payload    COMM     Electromechanical  No             Parts/Quality
Special          SPEC     Electromechanical  Yes            Design/Envmt,
                                                            Parts/Quality
Propulsion       PROP     Mechanical         No             Design/Envmt
Structural       STRUC    Mechanical         No             Design/Envmt
Nav. Payload     NAV      Electronic         No             Design/Envmt

*  Failure rate (i.e., no. of subsystem failures per mission per year) that
   shows a statistically significant decrease over time as measured by a
   correlation coefficient above 0.7 (see section 4.2.2)

** Known failure mechanisms were divided into three overall categories:
   design/environment, parts/quality, and other (see section 4.2.3)

The characterization of the communication payloads as an electromechanical
system may at first appear puzzling. The payload includes in many cases the
sliprings which provide the connection to the despun portion of the satellite
and in other instances the steering mechanism for antennas.


4.2.1  Distribution  of  Subsystem  Failures 

Figure 4-6 shows the distribution of all failure reports by subsystem. The
subsystems which have the most failures are all complex electronic or
electromechanical systems. The low-failing subsystems include several that
are active for only a small portion of the total mission time, such as
propulsion and the deployment portion of the structural subsystem.

Figures 4-7 and 4-8 show the initial (first month on orbit) and first year
distribution of subsystem failures. Failures in the visual-IR subsystem make
up a larger portion of the earlier failures than in the total population,
whereas the power distribution and communication payload failures comprise a
smaller fraction. In the structural subsystem the failure of deployment
mechanisms is clearly responsible for the unreliable operation during the
initial month on orbit.

Figure 4-9 shows that excess contributions to failures after five operational
years are distributed in the opposite way: power and communications payload
are high and the structures subsystem is low. Wearout effects in batteries,
solar cells, and traveling wave tubes are believed to be responsible for much
of the unreliability of the former two subsystems. Wearout or depletion
effects may also be responsible for the relatively large number of failures
associated with the propulsion subsystem. The small contribution of the
structures subsystem is due to the static role of the structural components
in the steady-state orbital phase.

In Figures 4-10 through 4-13 the failure contributions of subsystems are
divided into pre-1977 and later programs (see Section 3.3). Because failures
from the pre-1977 programs make up approximately 75% of the data base, the
similarity of their failure distributions to the overall sample is not
surprising. The failures from the later programs show a higher proportion
associated with the visual-IR, data management, special, and navigation
payload subsystems. The former two can be explained by both the larger
number and increasing complexity of such systems on later spacecraft; the
latter two can be explained by the larger number of relevant missions (i.e.,
the GPS constellation).

The later programs demonstrate the progress made in the implementation of
mature subsystems such as guidance, electrical power supply and distribution,
or communication payloads. The fraction of these failures is lower than in
the pre-1977 programs. The distribution of failures in other subsystems is
approximately the same for both the earlier and later programs. The lack of
failure reports from some subsystems in the later programs for orbit life of
five years or more (Figure 4-13B) may be due to the small number of
satellites that have completed the required lifetime.

4.2.2  Time  Dependence 

Figures 4-14 through 4-24 display failure ratios (failures per mission per
year) of individual subsystems as a function of time on orbit. The data have
been calculated and smoothed as described in Appendix A. The more frequently
failing subsystems (telemetry, guidance, electrical power, visual/IR sensors,
and data management) have decreasing failure ratios, i.e., their reliability
improves over time. However, most other subsystems (thermal, communication
payloads, propulsion, and navigation payloads) do not exhibit such behavior.

Table 4-3 shows data for linear regressions within each subsystem on failure
ratio versus time. The table shows that where the slopes are statistically
significant (defined as a coefficient of determination, R^2, of 0.5 or
greater) they are always negative. Furthermore, statistically significant
negative slopes are primarily found among the subsystems with the greatest
number of failures.
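
The regression test described above can be sketched in a few lines. This is a
minimal illustration, not the procedure of Appendix A: the function names and
the sample failure ratios are hypothetical, and only the decision rule (a
negative slope with R^2 of 0.5 or greater) is taken from the text.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


def is_significant_decrease(slope, r_squared, threshold=0.5):
    """Decision rule from the text: R^2 >= 0.5 and a negative slope."""
    return r_squared >= threshold and slope < 0


# Hypothetical failure ratios for eight consecutive 6-month periods.
periods = [1, 2, 3, 4, 5, 6, 7, 8]
ratios = [0.40, 0.36, 0.35, 0.30, 0.27, 0.26, 0.22, 0.20]

slope, intercept, r, r_squared = linear_fit(periods, ratios)
decreasing = is_significant_decrease(slope, r_squared)
```

With a correlation coefficient of -0.70 the coefficient of determination is
0.49, which is why a subsystem can sit right at the boundary of this rule.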

The  communication  payload  (Figure  4-20)  and  the  propulsion  subsystem  (Figure 
4-22)  exhibit  wearout  effects  which  are  consistent  with  the  known  equipment 
characteristics  of  these  functions.  Several  other  subsystems  show  no 
significant  time  dependency  of  the  failure  ratio  after  an  initial  period  of 
high  failures.  The  thermal  subsystem  (Figure  4-19),  the  structural  subsystem 
(Figure  4-23),  and  the  navigation  payload  (Figure  4-24)  are  among  these. 


FIGURE 4-17 TIME TREND OF VISUAL & IR SENSOR SUBSYSTEM FAILURES
(horizontal axis: year on orbit, 1 through 8; vertical axis: failures per
spacecraft-year)


TABLE 4-3 LINEAR REGRESSION ON SLOPE OF FAILURE RATIO

SUBSYSTEM           R      Coeff.   Std.   Signif.*  Intercept**  Slope**
                           of det.  Err.

Telemetry           -.77   .60      .11    .0001     .43 (.05)    -.024 (.005)
Guidance            -.70   .49      .07    .0009     .22 (.03)    -.011 (.003)
Power               -.85   .73      .06    .0000     .24 (.03)    -.017 (.002)
Visual/IR Sensors   -.88   .78      .04    .0000     .24 (.04)    -.014 (.002)
Data Management     -.71   .51      .05    .0006     45 (.02)     -.009 (.002)
Thermal Subsystem   -.67   .45      .03    .0018     .09 (.01)    -.005 (.001)
Communic. Payload   -.49   .24      .06    .0363     .04 (.03)    .006 (.003)
Special Payload     -.88   .77      .02    .0000     .09 (.01)    -.006 (.001)
Propulsion          .05    .00      .04    .8460     .04 (.02)    .000 (.001)
Structural          -.45   .20      .04    .0543     .05 (.02)    -.004 (.002)
Navig. Payload      -.65   .42      .03    .0028     .07 (.01)    -.004 (.001)

*  F-distribution probability that such results could have been due
   to chance

** Quantities in parentheses are standard errors of the estimate. Units
   are failures per 6-months per mission


4.2.3  Causes  of  Failure  within  Subsystems 

The following discussion is concerned with causes of failure within each
subsystem. It supplements Section 3.4 in which the contribution of
subsystems to each cause category was investigated. The percentage
contributions of major causes to failures within each subsystem are shown in
Table 4-4. Design and environment failures are the most important
contributors in most cases. The parts and quality cause is the most
significant one for the communication payload and data management subsystems,
both of which employ a large number of complex electronic components. The
same pattern might be true for telemetry, visual/IR sensors and special
payloads if the large percentage of unknown failures in these subsystems is
mostly composed of parts and quality causes.


TABLE 4-4. COMPOSITION OF SUBSYSTEM FAILURES BY CAUSE

SUBSYSTEM    DESIGN   ENVMT    PARTS/    OPER/    UNKNOWN
                               QUALITY   OTHER

Telemetry    21.0%    18.3%    24.2%     9.7%     26.8%
Guidance     28.7%    20.7%    20.7%     11.1%    18.7%
Power        21.4%    20.4%    19.8%     17.8%    20.6%
Visual/IR    20.5%    23.0%    15.2%     8.9%     32.4%
Data Mgt     18.8%    17.3%    33.7%     19.9%    10.3%
Thermal      33.5%    30.5%    19.5%     3.8%     12.7%
Communic.    7.3%     13.8%    38.8%     6.1%     24.0%
Spec Pyld    19.0%    24.4%    21.4%     10.1%    25.0%
Structures   43.0%    17.4%    24.0%     5.8%     9.9%
Nav. Pyld    31.6%    26.6%    22.2%     10.8%    8.9%


4.2.4  Electronic,  Electromechanical  and  Mechanical  Subsystems 

In order to investigate whether the time-dependent failure behavior and the
causes of failures are affected by the predominant component type or
function, subsystems were grouped into the following three categories:

ELECTRONIC          Telemetry, Command, and Control
                    Visual-IR Sensors
                    Data Management
                    Navigation Payload

ELECTROMECHANICAL   Guidance
                    Special Payloads
                    Power
                    Communication Payload

MECHANICAL          Thermal
                    Propulsion
                    Structural

Figure 4-25 shows the contribution of each of the groupings to the failure
causes. Figures 4-26 through 4-28 show the relative importance of various
causes within each of the groupings. Although design and environment
failures were the primary causes in all categories, they were most important
for mechanical subsystems. Unknown causes were an important contributor to
the electronic category and reflect its more complicated failure modes.




Figures 4-29 through 4-31 show the time dependent behavior of electronic,
electromechanical, and mechanical subsystems. There is no statistically
significant slope in the mechanical category (this is consistent with earlier
results), but a definite negative slope is present in the case of electronic
subsystems, which indicates a decreasing hazard. A smaller, but still
decidedly negative, slope is evident for electromechanical subsystems. This
result can be explained by the presence of both electronic (decreasing
hazard) and mechanical (non-decreasing hazard) failure mechanisms. There is
some evidence of wearout among the mechanical systems, much of it apparently
due to the propulsion components.

Since the primary purpose of this investigation is to provide a basis for
improved reliability prediction for spacecraft within the scope of
MIL-HDBK-217, and since the latter deals specifically with electronic
equipment, the question arises whether the time dependency aspects of the
prediction procedures should be based on the total population of failure
reports or specifically on those dealing with electronic equipment. In this
connection it is necessary to make a distinction between electronic equipment
and electronic systems or subsystems as classified in the earlier portions of
this section. Electronic equipment is the preponderant contributor to
failures in both the electronic and the electromechanical subsystems
described here, but it is not a significant contributor to failures in the
mechanical subsystems. A comparison of the failure ratio for electronic and
electromechanical systems with that for the entire population is shown in
Figure 4-32. It is seen that the general time trend (which determines the b
parameter of the Weibull distribution) is identical for both populations. On
detailed inspection it will be noted that the difference between the two
graphs in Figure 4-32 is greater at the beginning and at the end than in the
middle. This is due to the large proportion of failures during the first
year and to the wearout effects that can be seen starting after the fourth
year in Figure 4-31.

The primary reliability prediction procedure described in the following
section is based on the Weibull b parameter derived for the entire population
because this was simpler to implement and because the difference in the time
relationships between the two graphs in Figure 4-32 is so small. A further
consideration is that the validation of the model had to be made against data
for an entire spacecraft for which no breakdown between electronic and other
equipment was available.

FIGURE 4-29 TIME TREND OF FAILURES IN ELECTRONIC SUBSYSTEMS

FIGURE 4-30 TIME TREND OF FAILURES IN ELECTROMECHANICAL SUBSYSTEMS


4.3 Complexity Effects in Selected Subsystems

This section explores the relationship between reliability and subsystem
complexity. Because complexity involves many factors (e.g., number of
components, interconnections, constraints), it is difficult to develop a
direct measure that is unambiguous. However, other more easily determined
indicators may serve as useful surrogates. Table 4-5 shows such indicators
for subsystems where design and environment were the most important causes.

TABLE 4-5. COMPLEXITY INDICATORS FOR SELECTED SUBSYSTEMS

TELEMETRY   Presence of a computer

GUIDANCE    Nature of stabilization (i.e., 3-axis,
            spin-stabilized, or gravity stabilized)

POWER       Capacity

THERMAL     Active or passive


To determine whether the presence or absence of complexity indicators had a
statistically significant effect on the failure ratio the following tests
were performed:

- For discrete variables, missions were grouped into those using or not
  using the indicator. Upper and lower 90% confidence bounds on the
  failure ratio were computed for each group. If these intervals did not
  overlap, then the indicator was considered significant. The technique
  used to determine these confidence intervals is taken from Epstein
  [EPST60].

- For continuous variables, a linear regression of MTBF versus the value of
  the variable was performed. If the coefficient of determination (R^2)
  was greater than 0.5, then the indicator was considered significant.

In most of the subsystems investigated the more complex implementation was
associated with a much higher failure rate. It is realized that the
complexity is introduced because it is essential for functional or accuracy
requirements of the mission. Nevertheless, the significantly lower
reliability of the more complex subsystems should be considered in any
trade-offs.

4.3.1  Telemetry 

Data on whether the Telemetry, Tracking, and Control system included either
an on-board CPU or a hardwired encoder/decoder unit were available for a
total of 101 flights comprising almost 3800 orbital months. As shown in
Table 4-6, CPU-based systems had more than five times the failure rate of the
hardwired systems (based on point estimates of lambda). Because many
computer-related failures are less severe than those occurring on totally
hardwired systems, a second analysis was performed on only failures of the
three most critical classes. These results, also shown in Table 4-6, confirm
that there is a significant difference between the failure rates of CPU-based
and hardwired systems, although the difference (in both relative and absolute
terms) is not as large as when all failures are considered. The results of
this analysis demonstrate that complexity, as manifested by the presence of
an onboard CPU, affects the failure rates of the telemetry, tracking, and
control subsystem.

TABLE 4-6 EFFECT OF COMPLEXITY ON TELEMETRY SUBSYSTEM

                        ALL FAILURES         CRITICALITY 1-3 FAILURES
                        CPU      Hardwired   CPU      Hardwired

Time in Orbit, years    42       272         42       272
No. of Failures         98       110         32       45
No. of Flights          19       82          19       82
Lambda, per year
  Point Estimate        0.19     0.034       0.063    0.014
  Lower Limit*          0.16     0.030       0.049    0.011
  Upper Limit*          0.23     0.038       0.077    0.016

*90% Confidence Interval
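
The point estimates in Table 4-6 can be checked from the failure counts and
orbit times. The sketch below is illustrative only; the dictionary keys and
function name are hypothetical. One observation, offered as an assumption
rather than a statement of the report's method: the tabulated lambda values
are reproduced when time in orbit is expressed in months (years times 12),
which is consistent with the "orbital months" quoted in the text.

```python
MONTHS_PER_YEAR = 12


def failure_rate(failures, exposure):
    """Point estimate of a constant failure rate: failures per unit exposure."""
    return failures / exposure


# (number of failures, time in orbit in years), copied from Table 4-6.
table_4_6 = {
    "all/cpu": (98, 42),
    "all/hardwired": (110, 272),
    "critical/cpu": (32, 42),
    "critical/hardwired": (45, 272),
}

# Assumption: exposure is converted to orbital months before dividing.
rate_per_month = {
    key: failure_rate(failures, years * MONTHS_PER_YEAR)
    for key, (failures, years) in table_4_6.items()
}

# Relative failure rate of CPU-based versus hardwired systems.
ratio_all = rate_per_month["all/cpu"] / rate_per_month["all/hardwired"]
ratio_critical = (
    rate_per_month["critical/cpu"] / rate_per_month["critical/hardwired"]
)
```

The ratio for all failures exceeds five, and the ratio for criticality 1-3
failures is smaller, matching the statement that the difference narrows when
only the most critical failures are counted.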

4.3.2  Guidance 

The complexity of guidance and stabilization subsystems was characterized by
the satellite stabilization method, three-axis stabilization being the
most complex, gravity stabilization being the least complex, and spin
stabilization being of intermediate complexity. Data on the nature of the
satellite stabilization system were available on a total of 180 flights and
6600 orbital months. Table 4-7 summarizes the results of the analysis.
Using the decision rules defined at the beginning of this section, one can
state that subsystems using 3-axis stabilization had a significantly higher
failure ratio than those using spin stabilization, and that the latter in
turn had a much higher failure ratio than those using gravity stabilization.


TABLE 4-7 EFFECT OF COMPLEXITY ON THE GUIDANCE SUBSYSTEM

                        3-axis   Spin     Gravity

Time in Orbit, years    132      360      53
No. of Failures         81       78       2
No. of Flights          56       78       46
Lambda, per year
  Point Est.            0.61     0.216    0.038
  Lower Limit*          0.53     0.18     0.0072
  Upper Limit*          0.70     0.24     0.090

*90% Confidence Interval
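
The decision rule for discrete indicators (non-overlapping confidence
intervals) can be applied directly to the limits in Table 4-7. The sketch
below is a minimal illustration with hypothetical names; it takes the
tabulated limits as given and does not attempt to reproduce the Epstein
interval calculation.

```python
from typing import NamedTuple


class RateEstimate(NamedTuple):
    failures: int
    years: float
    lower: float  # lower confidence limit, per year (from Table 4-7)
    upper: float  # upper confidence limit, per year (from Table 4-7)

    @property
    def point(self):
        """Point estimate: failures per year in orbit."""
        return self.failures / self.years


def intervals_overlap(a, b):
    """True if the two confidence intervals share any value."""
    return a.lower <= b.upper and b.lower <= a.upper


def significantly_different(a, b):
    """Decision rule from the text: significant if intervals do not overlap."""
    return not intervals_overlap(a, b)


guidance = {
    "3-axis": RateEstimate(81, 132, 0.53, 0.70),
    "spin": RateEstimate(78, 360, 0.18, 0.24),
    "gravity": RateEstimate(2, 53, 0.0072, 0.090),
}

axis_vs_spin = significantly_different(guidance["3-axis"], guidance["spin"])
spin_vs_gravity = significantly_different(guidance["spin"], guidance["gravity"])
```

Both comparisons come out significant under this rule, since the lower limit
of the more complex method lies above the upper limit of the simpler one.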

4.3.3  Power 

The capacity of the power supply and distribution subsystem was not a good
indicator for its failure rate. The coefficient of determination (R^2) was
0.0007, and the significance of the F-distribution was well below the 90%
decision point. The probable explanation is that larger power supplies were
placed on later satellites and therefore represented a more mature
technology. Another factor is that larger capacity power systems do not
necessarily involve a larger number of components or more complex ones.
Finally, the percentage of the power system capacity utilized may be less
for large systems, thereby promoting higher reliability.


4.3.4  Thermal 

The use of active thermal control (e.g., thermal louvers, heaters, etc.)
versus total reliance on passive measures (reflective and insulating
coatings, etc.) was the basis for determining thermal subsystem complexity.
The sample consisted of 83 flights comprising close to 2900 orbital months.
Table 4-8 shows that the point estimate of the failure rate for the passive
systems was about one-quarter of that of the active subsystems. These
conclusions are significant at well over the 90% level.


TABLE 4-8 EFFECT OF COMPLEXITY ON THE THERMAL SUBSYSTEM

                        ACTIVE   PASSIVE

Time in Orbit, years    108      131
No. of Failures         35       11
No. of Flights          48       35
Lambda
  Point Estimate        0.32     0.084
  Lower Limit*          0.25     0.053
  Upper Limit*          0.40     0.12

*90% Confidence Interval


4.4  Mission  Effects 

This section discusses the results of analyses by mission type based on the
following four mission classifications:

NAVIGATION      Operational navigation satellites (excluding experimental
                launches such as NTS).

OBSERVATION     Meteorology, Earth Resource, Reconnaissance and
                Surveillance satellites; excludes experimental and
                research missions (which are included in the next
                category).

SCIENTIFIC      Experimental and Scientific launches including both NASA
                and DoD research (as opposed to operational) missions.

COMMUNICATION   Commercial and military communication satellites.

Table 4-9 shows the major satellite programs that were included in each
category, together with the number of flights and failure reports.


TABLE 4-9 CLASSIFICATION OF PROGRAMS BY MISSION TYPE

MISSION CLASSIFICATION  PROGRAMS                       NO. FLIGHTS  NO. FAILURES

NAVIGATION              TRANSIT                        22           241
                        GPS

OBSERVATION             DMSP                           79           912
                        ITOS / NOAA / TIROS N
                        LANDSAT
                        METEOSAT
                        NIMBUS
                        SEASAT
                        SMS
                        TIROS
                        VELA

SCIENTIFIC              ANNA                           120          703
                        ARIEL
                        ATS
                        BIOSAT
                        DYNAMICS EXPLORER
                        ESSA
                        EXPLORER
                        GEOS
                        GOES
                        HCMM (AEM 1)
                        HEAO
                        HERMES
                        INJUN
                        ISS
                        IUE
                        LES
                        MAGSAT
                        MARINER
                        LUNAR ORBITER
                        OAO
                        OGO
                        PEGASUS
                        PIONEER
                        RANGER
                        SAGE (AEM 2)
                        SOLAR MAX
                        SURVEYOR
                        TDRS
                        USAF SPACE TEST PROGRAM
                        VANGUARD
                        VIKING
                        VOYAGER

COMMUNICATION           DSCS (II AND III)              100          490
                        FLTSATCOM
                        IDCSP
                        INTELSAT (II, III, IV, and V)
                        MARECS
                        NATO (II and III)
                        SKYNET (I and II)
                        TELSTAR
                        SYNCOM
                        INSAT
                        MARISAT
                        SATCOM


Three major categories of subsystems were established for this analysis:

COMMON ELECTRONIC & ELECTROMECHANICAL   Telemetry
                                        Guidance
                                        Power
                                        Data Management

COMMON MECHANICAL                       Thermal
                                        Structural
                                        Propulsion

PAYLOAD                                 Visual/IR sensors
                                        Navigation Payload
                                        Communication Payload
                                        Special Payloads

Figure 4-33 depicts the results of the failure rate determinations by mission
type and subsystem type. Navigation satellites show the highest failure
rate, a result which reflects primarily the GPS constellation (the only other
navigational satellite program was the relatively simple TRANSIT). Earth
observation satellites had the next highest failure rate, a reflection of the
complexity of the instrumentation, telemetry, and guidance systems on many of
these missions. The lowest failure rates were in the communication
satellites. This reliability can be attributed to their previously noted
technological maturity. This rank ordering of failure rates is consistent
across the subsystem groupings defined above.

FIGURE 4-33 FAILURE RATES BY MAJOR SUBSYSTEM AND MISSION CATEGORIES
(failure rate per year for the total, electronic, mechanical, and payload
groupings, by mission type)

Table 4-10 and Figures 4-34 and 4-35 show the two Weibull parameters for
missions and subsystems. Navigation and communication satellites have beta
values of greater than 0.5 (i.e., failure rates are not decreasing
significantly through the mission life) on both an overall basis and for most
subsystems. With the exception of scientific/experimental missions, the beta
values for mechanical subsystems are also greater than 0.5. These results are
discussed for the individual mission types in the following subsections.
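
The role of the shape parameter can be illustrated with the textbook
two-parameter Weibull hazard. This is a sketch under stated assumptions: the
form h(t) = (shape/scale) * (t/scale)^(shape - 1) is the standard one and may
not match the parameterization behind the alpha values in Table 4-10, so the
scale is set to an arbitrary 1.0 and only beta values from the table are used
as the shape. Function names are hypothetical.

```python
import math


def weibull_hazard(t, scale, shape):
    """Hazard of the standard two-parameter Weibull distribution (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_reliability(t, scale, shape):
    """Survival probability R(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))


def hazard_trend(shape, tol=1e-9):
    """Classify the hazard by its shape parameter."""
    if shape < 1 - tol:
        return "decreasing"
    if shape > 1 + tol:
        return "increasing"
    return "constant"


SCALE = 1.0  # arbitrary illustrative value, not taken from the report

# Beta estimates taken from Table 4-10 (total observation mission and
# mechanical subsystems on navigation missions).
beta_observation_total = 0.389
beta_navigation_mechanical = 1.894

early = weibull_hazard(1.0, SCALE, beta_observation_total)
late = weibull_hazard(5.0, SCALE, beta_observation_total)
```

A shape of exactly 1 reduces to the constant-hazard exponential law assumed
by MIL-HDBK-217; the closer beta is to 1 from below, the weaker the decrease
in hazard over the mission life.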

TABLE 4-10 WEIBULL PARAMETERS FOR MISSIONS AND SUBSYSTEM GROUPINGS

                        ALPHA (a)            BETA
                        Est      Std. Err    Est      Std. Err

NAVIGATION
  Total Mission         0.160    0.159       0.916    0.069
  Electronic/Elmech.    0.261    0.267       0.876    0.089
  Mechanical            6.166    4.317       1.894    0.106
  Payload               0.495    0.443       1.155    0.114

OBSERVATION
  Total Mission         0.087    0.061       0.389    0.044
  Electronic/Elmech.    0.153    0.095       0.470    0.061
  Mechanical            1.875    0.996       0.738    0.100
  Payload               0.217    0.196       0.243    0.055

SCIENTIFIC
  Total Mission         0.080    0.044       0.175    0.084
  Electronic/Elmech.    0.150    0.073       0.220    0.091
  Mechanical            1.426    0.752       0.255    0.120
  Payload               0.611    0.238       0.378    0.125

COMMUNICATION
  Total Mission         0.397    0.279       0.668    0.095
  Electronic/Elmech.    0.419    0.346       0.463    0.043
  Mechanical            4.674    3.806       0.926    0.372
  Payload               3.024    1.690       1 J.17   0.125

TOTAL

FIGURE 4-34 WEIBULL ALPHA BY MAJOR SUBSYSTEM AND MISSION CATEGORIES

FIGURE 4-35 WEIBULL BETA BY MAJOR SUBSYSTEM AND MISSION CATEGORIES

4.4.1 Navigation Missions


Figure 4-36 shows the distribution of failure reports by subsystem in
navigation satellites. The largest single contributor is the navigational
payload subsystem. Other important factors are the telemetry and data
management subsystems. Table 4-11 shows the results of failure rate
calculations for the entire mission and major subsystem groupings.

TABLE 4-11 FAILURE RATES FOR NAVIGATION MISSIONS

                    TOTAL     COMMON SUBSYSTEMS            PAYLOAD
                    MISSION   Electronic/    Mechanical    SUBSYSTEMS
                              Electromech.

Years in Orbit      45
No. of Failures     241       136            18            87
Failure Rate
  Point Estimate    5.33      3.01           0.40          1.92
  Lower Limit*      4.89      2.68           0.28          1.66
  Upper Limit*      5.77      3.34           0.52          2.19

*90% Confidence Interval

Figure 4-37 shows the behavior of failure ratios (failures per mission-year)
over time. The overall mission and major subsystem groupings do not exhibit
reliability growth, a fact confirmed by the Weibull curve fitting whose
results are shown in Table 4-10.

FIGURE 4-36 FAILURES BY SUBSYSTEMS IN NAVIGATION MISSIONS

4.4.2 Earth Observation Missions

The distribution of failures by subsystems in earth observation missions is
shown in Figure 4-38. Common electronic/electromechanical subsystems are the
most significant source of failures; telemetry accounts for more than 30% of
the total. Mission payloads (visual/IR sensors and special payloads) make up
about one-quarter of the failure reports. The high percentage of telemetry
and data management failures is related to the number of missions which are
in low orbits and therefore include magnetic tape recorders. Table 4-12
shows the results of failure rate calculations for earth observation
missions.


TABLE 4-12 FAILURE RATES FOR EARTH OBSERVATION MISSIONS

                    TOTAL     COMMON SUBSYSTEMS            PAYLOAD
                    MISSION   Electronic/    Mechanical    SUBSYSTEMS
                              Electromech.

Years in Orbit      218
No. of Failures     912
Failure Rate
  Point Estimate    4.18
  Lower Limit*      4.00
  Upper Limit*      4.36

*90% Confidence Interval

4.4.3  Scientific  and  Experimental  Missions 

As shown in Figure 4-41, scientific and experimental satellites exhibit a
strongly decreasing failure ratio. The primary explanation is the importance
of electronic and electromechanical subsystems in both the mission and the
payload. The nature of scientific and experimental satellite missions is
such that design and environment related failures are also much more
significant than in other mission classifications. The decreasing failure
ratio is consistent with this explanation. Table 4-10 shows Weibull
parameters for this mission class.

4.4.4  Communication  Satellites 

The distribution of failures by subsystems for communication satellites is
shown in Figure 4-42. The common electronic/electromechanical subsystem
grouping accounts for the largest fraction of all failures, but the mission
payload is the single most important contributor. That the telemetry
subsystem is less important is evidence of both the well understood nature of
communication satellite telemetry and control and the fact that most of these
vehicles are in geostationary orbits. Table 4-13 shows the result of failure
rate and MTBF calculations.

FIGURE 4-43 TIME TREND OF FAILURES FOR COMMUNICATION MISSIONS

TABLE 4-13 FAILURE RATES FOR COMMUNICATION MISSIONS

                    TOTAL     COMMON SUBSYSTEMS            PAYLOAD
                    MISSION   Electronic/    Mechanical    SUBSYSTEMS
                              Electromech.

Years in Orbit      324
No. of Failures     490       286            75            129
Failure Rate
  Point Estimate    0.66      1.13           4.32          2.51
  Lower Limit*      0.62      1.05           3.76          2.25
  Upper Limit*      0.70      1.22           5.06          2.83

*90% Confidence Interval
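
Failure rate and MTBF are reciprocals under a constant-rate assumption, and
the counts in Table 4-13 allow both to be computed. The sketch below uses
hypothetical names. It suggests, as an observation rather than a correction,
that the tabulated point estimates coincide with years in orbit divided by
number of failures, that is, with MTBF in years.

```python
def rate_and_mtbf(failures, years):
    """Return (failures per year, mean time between failures in years)."""
    rate = failures / years
    return rate, 1.0 / rate


YEARS_IN_ORBIT = 324  # communication missions, Table 4-13

# Number of failures per grouping, copied from Table 4-13.
failures_by_group = {
    "total": 490,
    "electronic/electromech.": 286,
    "mechanical": 75,
    "payload": 129,
}

results = {
    group: rate_and_mtbf(count, YEARS_IN_ORBIT)
    for group, count in failures_by_group.items()
}

total_rate, total_mtbf = results["total"]
```

For the total mission this gives roughly 1.5 failures per year, well below
the navigation and earth observation rates in Tables 4-11 and 4-12.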


As shown in Figure 4-43, communication satellites do not have a strongly
decreasing failure ratio overall, but the failure ratio of electronic and
electromechanical subsystems does show the usual reliability growth. The
non-decreasing failure ratios in the mission payload can be attributed to (a)
the presence of non-electronic components (e.g., mechanical despin
assemblies, pointing mechanisms, and antennas) and (b) wearout in travelling
wave tubes. Table 4-10 shows the results of Weibull parameter estimations
for these missions.


4.5 Orbital Effects

The following orbit classifications were used for this analysis:

LOW      Perigee of less than 200 Km

MEDIUM   Perigee between 200 and 2000 Km

HIGH     Geostationary (perigee of 35,000 Km)
         and extraterrestrial missions.

Because so many factors are related to orbit (e.g., mission type, satellite
complexity, etc.), analyzing all mission failures with respect to orbit could
lead to erroneous results. For example, any differences between
geosynchronous and extraterrestrial satellites are more likely to be due to
mission differences (primarily communications versus scientific) than to
effects of the trajectory.

Therefore, the analysis of orbit effects was restricted to the telemetry
subsystem, which had considerable commonality between missions. Table 4-14
summarizes the result of the analysis. When all failure reports were
included, there were statistically significant differences in the failure
rates of the low orbit satellites on one hand, and medium and high orbit
satellites (between which there was little difference) on the other. Further
investigation of these trends revealed that magnetic tape recorder (MTR)
related incidents accounted for approximately one-third of the low orbit
failures, one quarter of the medium orbit failures, and less than 5% of the
high orbit failures. The need for MTRs is a consequence of the satellite
orbit but is not a directly related physical effect. Thus, the analyses were
repeated for low and medium orbit missions with the MTR-related failure
reports censored. Table 4-15 shows that the failure rates without the MTRs
are practically the same for low and medium orbit satellites, and that the
failure rate for high orbits is only slightly higher. From this limited
investigation it is concluded that orbit parameters do not have a significant
effect on the failure rate of telemetry subsystems.


TABLE 4-14  FAILURE RATES FOR TELEMETRY SUBSYSTEM BY ORBIT

            WITH MTR (per year)              WITHOUT MTR (per year)
ORBIT       Point      Upper     Lower       Point      Upper     Lower
            Estimate   Limit*    Limit*      Estimate   Limit*    Limit*

LOW         --         1.11      0.91        0.54       0.62      0.47
MEDIUM      --         0.84      0.68        0.54       0.60      0.47
HIGH        --         0.72      0.62        approx. same as with MTR**

*90% confidence interval

**only 11 out of 269 total failure reports were related to MTRs
Chapter 5

USE OF MIL-HDBK-217 FOR SPACECRAFT RELIABILITY PREDICTION


From the spacecraft reliability experience and its analysis
contained in the preceding chapters a reliability prediction
methodology consistent with MIL-HDBK-217D is formulated. The
primary procedure combines the conventional parts-based model
with a Weibull term to account for the decreasing hazard
phenomenon that has been described in the preceding chapters.
Alternate procedures are provided for special situations.


5.1 SUITABILITY OF MIL-HDBK-217 METHODOLOGY

In reliability prediction for electronic equipment it is usually taken for
granted that system failures are due to failures at the part level, and that
the latter are a random phenomenon governed by the exponential failure law.
These assumptions are also the basis for both of the prediction procedures in
MIL-HDBK-217. As discussed in sections 2.2 and 2.3 of this report, only
about one-half of the failures observed on spacecraft conform to this
classical failure pattern (those classified as due to parts, quality and
unknown causes). The other one-half, primarily due to design and
environmental causes, exhibits a hazard that shows a pronounced decrease with
time on orbit.

A challenging part of the work reported on here was to develop a reliability
prediction methodology that was consistent with the experience of the space
programs and yet was compatible with the overall approach of MIL-HDBK-217.
Adherence to MIL-HDBK-217 procedures is important because of

- the familiarity of Government and industry personnel with that
methodology

- the considerable investment in computerized procedures based on
MIL-HDBK-217

- the need to utilize the parts reliability experience that is being
accumulated in applications other than on spacecraft

The last point is due to the comparatively small number of electronic parts
failures that are observed on spacecraft and the even smaller number which
can definitely be associated with a specific part type. There were fewer
than 30 permanent failures in memory devices for the entire spacecraft
population surveyed here, and none of these devices could be made available
for a post-failure physical analysis. In contrast, the RAC MDR series of
publications reports on failures of several thousand memory devices each
year, most of which occur in known and controlled environments and at least
some of which are subjected to a detailed post-failure analysis. An
unpublished study by a major supplier of spacecraft reported a total of 9.62
failures in microcircuits in 892 x 10^6 part hours during the twelve years
prior to 1983. The fractional number of failures is due to allocation among
part types where the failure was attributable to one of several parts. For all
other part types the number of attributed failures is even smaller. In most
cases the observed orbital failure rates are within a factor of 5 of those
predicted by MIL-HDBK-217D procedures. Since the base failure rates for some
part types changed by approximately the same ratio between the C and D
versions of MIL-HDBK-217 (published in 1979 and 1982, respectively), the
parts failure predictions generated for space applications appear to fall
within broadly acceptable limits.

Nothing in the data studied as part of this effort indicates that the
electronic parts failure process in space differs from that in other
applications once the proper environmental model is known. Thus, improved
reliability prediction for spacecraft seems to depend much more on the
development of accurate thermal and radiation models than on the modification
of the parts failure rates in MIL-HDBK-217 for a given environment.


An important consideration in applying MIL-HDBK-217 procedures and data to
space applications is how the past predictions compare with the achieved
reliability. Accurate data for this purpose are very difficult to obtain
because in most cases a complete reliability prediction is made only very
early in the development phase (in many cases in connection with a proposal)
and the launched configuration differs markedly from that which was
analyzed. Some spacecraft contractors maintain updated files of the
predicted reliability but these are usually considered proprietary data. One
major systems company made data without attribution available to this study
which permit a comparison of predicted (by existing MIL-HDBK-217D procedures)
vs. achieved ("demonstrated at 50% confidence") reliability at the
spacecraft level. Failure of the major mission function was equated to total
spacecraft failure in this analysis. A graphical representation of the data
for two programs, each involving multiple satellites, is shown in Figure 5-1.
Because of the extensive redundancy provisions neither the predicted nor the
demonstrated reliability follows the exponential relation. At an earlier time
(ca. 1975) a comparison of observed vs. predicted reliability had been made
for a number of programs with time on orbit as the independent variable. A
summary plot from that study is shown in Figure 5-2.
FIGURE 5-2.  RATIO OF OBSERVED/PREDICTED MTBF VS. MISSION DURATION


The experience on these programs is in agreement with other programs for
which summary data were obtained as part of the investigation leading to
this report but were not made available for publication. From the composite
of Figures 5-1 and 5-2 and other comparisons that were examined during the
course of this project it is concluded that present reliability predictions
for spacecraft overestimate the failure rate by at least a factor of two, and
that the excess of predicted over observed failures increases with time on
orbit. The reliability prediction procedures proposed below provide
corrections for both of these difficulties.


5.2  Proposed  Prediction  Procedure 

The primary procedure which is discussed here requires knowledge of the
spacecraft mission and of failure rates at a non-redundant level (typically
parts or subassembly). In the two alternate procedures, which are described
in a later section, one or the other of these requirements is waived.

5.2.1  Derivation 

The key element of the primary procedure is to regard the on-orbit
reliability as a product of two factors, of which the first represents the
conventional parts failures and the second the failures due to mission
effects, which include design, environmental and other causes. Thus, the
reliability of a non-redundant part or subassembly is obtained from

    $R = R_{parts} \cdot R_{mission}$     (5-1)

where $R_{parts} = \exp(-Mt)$ and $R_{mission} = \exp(-t^{b}/a)$. Reliability
predictions for higher levels of spacecraft systems, which typically include
redundancy, can be generated for a fixed mission time from the above
reliability prediction by conventional methods, described by MIL-STD-756B,
Method 1001.

The data presented in Figure 2-6 indicate that the two factors
(parts/quality/unknown and design/environment) make an equal contribution to
the total spacecraft hazard, and Figure 5-2 suggests that the cross-over
between the exponential and Weibull components occurs between 20 and 30
months on orbit (2 years has been used in the following). To obtain an
overall reliability prediction that results at the spacecraft level in
one-half of the failure probability obtained by current methods, the first
step is to determine the parameter of the exponential distribution, here M.
It is assumed that the failure probability of electronic and
electromechanical systems at the spacecraft level is dominated by redundant
functions (see footnote 1). Thus, if the new prediction for M is to result in
one-half the failure probability obtained by using the existing methodology
with parameter L,

    $[1 - \exp(-Lt)]^{2} = 2\,[1 - \exp(-2Mt)]^{2}$     (5-2)

The factor of 2 in the exponent on the right side of the equation is
necessary because the exponential part, represented by M, constitutes
nominally one-half of the total failure probability (see footnote 2). Thus

    $1 - \exp(-Lt) = 1.41\,[1 - \exp(-2Mt)]$     (5-3)

Using only the first two terms of the series expansion of the exponential,
$e^{-x} = 1 - x$, one obtains the approximation

    $M = L/2.82$     (5-4)


1. Very few of the essential spacecraft and mission functions do not employ
redundancy (this excludes experimental equipment, sensors, etc.). For small
equipment segments, such as the memory within a computer, a higher level of
redundancy may be employed but these segments do not make a significant
contribution to the total spacecraft reliability. See also the discussion of
Figure 5-5.
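The quality of the linearization behind equation 5-4 can be checked numerically. The sketch below solves equation 5-3 for M without the small-Lt approximation and compares the result with L/2.82; the function names and the sample Lt values are illustrative assumptions, not part of the report. The two agree closely when Lt is small and drift apart as Lt grows, which is the condition noted for the parts or subassembly level.

```python
import math

def m_exact(L, t):
    """Solve equation 5-3 for M without the small-Lt approximation."""
    q = (1.0 - math.exp(-L * t)) / math.sqrt(2.0)
    return -math.log(1.0 - q) / (2.0 * t)

def m_approx(L):
    """Equation 5-4: M = L / 2.82 (2.82 is 2 * 1.41)."""
    return L / 2.82

# Compare the two for a range of hazard-time products (t fixed at 1).
for Lt in (0.001, 0.01, 0.1, 0.5):
    exact, approx = m_exact(Lt, 1.0), m_approx(Lt)
    print(f"Lt = {Lt:5.3f}  exact M = {exact:.6f}  L/2.82 = {approx:.6f}")
```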

Next, the parameters of the Weibull term can be determined. Where
the specific mission type is not known, a generic assignment of b = 0.12 may
be made (see Table 2-3; this assignment is based on combining design and
environmental causes). Where information about the mission type is
available, more accurate assignments of b can be made from the following
table. The values for the b parameter shown here differ from those in Table
4-10 because the latter were computed for all causes, whereas those in Table
5-1 were computed only for failures due to design and environment causes.

TABLE 5-1  PREDICTION FACTORS BY MISSION TYPE

Mission Type      b       Ma*

General           0.12    0.54
Communication     0.4     0.66
Navigation        0.9     0.93
Observation       0.13    0.55
Scientific        0.09    0.53

* See below for the use and calculation of this factor


2. Equation 5-2 makes use of the approximation R = 1 - Lt, which is valid only
for Lt << 1. This is justified because the prediction procedure is applied at
the parts or subassembly level, at which the Lt product is indeed much less
than one.
Once b is known, a is computed to make the failure probability of the Weibull
term equal to that of the exponential term at t = 2 years. For the generic
case of b = 0.12

    $\exp(-2M) = \exp(-2^{0.12}/a) = \exp(-1.09/a)$

Hence,

    $Ma = 1.09/2 = 0.54$

where M is related to the hazard computed by the current MIL-HDBK-217D
methodology as indicated in equation 5-4. For the general spacecraft
category, the statement of the proposed method of reliability prediction is
therefore

    $R = \exp(-Mt) \cdot \exp(-t^{0.12}/a) = \exp\{-M(t + t^{0.12}/0.54)\}$
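The Ma factor follows directly from the matching condition at the cross-over time: equating the exponents Mt and t^b/a at t = 2 years gives Ma = 2^b / 2. The short sketch below applies that relation to the b values of Table 5-1; the function name and the `t_cross` parameter are illustrative assumptions.

```python
def ma_factor(b, t_cross=2.0):
    """M*a obtained by equating M*t with t**b / a at the cross-over time."""
    return t_cross ** b / t_cross

# b values as listed in Table 5-1
TABLE_5_1_B = {
    "General": 0.12,
    "Communication": 0.4,
    "Navigation": 0.9,
    "Observation": 0.13,
    "Scientific": 0.09,
}

for mission, b in TABLE_5_1_B.items():
    print(f"{mission:14s} b = {b:4.2f}  Ma = {ma_factor(b):.2f}")
```

Rounded to two decimals, the computed values agree with the Ma column of Table 5-1.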


5.2.2  Procedure 

The following four-step procedure implements the proposed modification of the
MIL-HDBK-217 reliability prediction for spacecraft. The Military Handbook
for Reliability Prediction of Electronic Equipment, MIL-HDBK-217, describes
two methods for reliability prediction:

- Part Stress Analysis, which accounts for the detailed thermal, electrical
and operational stresses to which each part is subjected

- Parts Count Method, which is based on an average stress exposure of the
parts

In both approaches the application environment (such as space flight) is an
element of the reliability prediction. The stress analysis utilizes an
explicit environment factor while the parts count provides a distinct
grouping of failure rates for each application environment, thus including an
implicit allowance for the environment factor. These differences affect only
the first step of the following procedure.

1. Obtain adjusted exponential hazard

   - If the parts stress method is used, divide λp by 2.82 (ref.
     equation 5-4)

   - If the parts count method is used, divide the individual failure
     rates by 2.82

2. Enter Table 5-1 to select b and Ma

3. Compute a from a = Ma/M

4. The complete prediction can then be computed from

    $R = \exp(-Mt) \cdot \exp(-t^{b}/a)$     Ref. eq. 5-1
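The four steps can be sketched as a single function. This is a minimal illustration under stated assumptions: the conventional hazard L is taken to be in failures per year (so that t is in years), the lookup keys are lower-case labels chosen here, and the function name is hypothetical. The b and Ma values are those of Table 5-1.

```python
import math

# (b, Ma) pairs from Table 5-1; the lower-case keys are an illustrative choice.
TABLE_5_1 = {
    "general": (0.12, 0.54),
    "communication": (0.4, 0.66),
    "navigation": (0.9, 0.93),
    "observation": (0.13, 0.55),
    "scientific": (0.09, 0.53),
}

def predict_reliability(L, t_years, mission="general"):
    """Single-string reliability by the four-step procedure.

    L is the conventional MIL-HDBK-217 hazard, assumed here to be per year.
    """
    M = L / 2.82                    # step 1: adjusted exponential hazard (eq. 5-4)
    b, Ma = TABLE_5_1[mission]      # step 2: select b and Ma
    a = Ma / M                      # step 3: Weibull scale parameter
    # step 4: product of the exponential and Weibull terms (eq. 5-1)
    return math.exp(-M * t_years) * math.exp(-t_years ** b / a)

for t in (1.0, 2.0, 5.0):
    print(f"t = {t:.0f} yr  R = {predict_reliability(0.705, t):.3f}")
```

With L = 0.705 per year the adjusted hazard is M = 0.25, the value quoted for Figure 5-4.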

5.2.3  Validation 

To validate this prediction methodology, it will be applied to the two
spacecraft programs for which predicted and demonstrated (achieved)
reliability had been depicted in Figure 5-1. Figure 5-3 is a repeat of that
figure with the addition of point predictions based on the proposed
methodology. It is not precisely known what satellite types are represented
in these figures but reasonable guesses lead to predictions that match the
demonstrated reliabilities rather closely. The following procedures were
used to generate these predictions:

1. The original prediction, which is based on the exact redundancy
structure and component count used in the spacecraft, is approximated by
a prediction for a hypothetical spacecraft consisting of five major
subsystems, each of equal complexity and each being redundant. The
reliability of this hypothetical spacecraft is given by

    $R = 1 - (1 - e^{-Lt})^{2}$
2. L was selected to give a good fit to the exact prediction curve; this
was achieved at L = 0.14 for both parts of the figure (this indicates
that the spacecraft were of nearly equal complexity). The computed
points are identified as open circles in the figure. Note that the
calculations based on dual redundancy provide an extremely good match
to the prediction based on the actual redundancy. This validates the
statement made in connection with the derivation of equation 5-2 that
the spacecraft reliability function is dominated by dual redundant
elements.

3. Beta was selected from Table 5-1 and a was then computed as discussed
above. The satellites shown in part A of Figure 5-3 were evaluated as
communication and observation satellites, with the latter giving a much
better fit. The satellites shown in part B of the figure were
evaluated as communication and navigation satellites, with the latter
giving a better fit.

The methodology proposed here requires only minor modifications of the
existing MIL-HDBK-217 data and procedures. Its chief advantage is that it
removes the systematic underestimation of reliability for long mission
durations which is inherent in the exponential assumption. A further
advantage is that it distinguishes between the principal types of space
missions and thus permits more appropriate estimates to be generated for
each.


FIGURE 5-3  PREDICTION BY PROPOSED METHOD (A. PROGRAM A; B. PROGRAM B)


The preferred reliability prediction procedure described above may be
inconvenient or difficult to perform in two environments

- for the general electronic equipment manufacturer

- in the early stages of space mission planning


Examples of general electronic equipment are telemetry components, power
supplies, and audio frequency amplifiers. Although units destined for
spacecraft applications may represent specialized designs and will receive
special care in assembly and test, the manufacturer's reliability
organization may find it very difficult to implement a completely separate
reliability prediction procedure for products which represent only a small
fraction of their total output. In many cases they will not be aware of the
exact satellite application category in which the units will be used. In
this environment the approximate exponential model described below will be
preferred.

In the early stages of mission planning the exact equipment complement and
redundancy provisions are usually not known. Therefore, the single string
reliability prediction is not useful, and in addition, the term of
equation 5-1 will be difficult to obtain. In this environment it is
customary to base reliability prediction on the achieved reliability of
similar spacecraft or major subsystems, extrapolating for longer mission
durations where that is necessary. The single-term Weibull model described
below is suitable for these purposes.

Because the alternative procedures will in general yield less accurate
predictions of space systems reliability and provide less insight into the
effects of mission or component changes, the primary procedure should be used
wherever possible.
5.3.1 Exponential Approximations


The basis for the piecewise exponential approach is that the conventional
reliability prediction methodology is generally regarded as workable for
spacecraft operations for the first year but that a reduced hazard is
applicable to subsequent years. Because reliability prediction is based on
the hazard-time product (Lt or Mt in the notation used here) it is simpler to
work with a modified time rather than to modify the hazard for each component
type. Thus, an approximate reliability prediction can be generated by
reducing the 'chargeable' mission time. A good approximation to the primary
prediction is obtained by using 40% of the mission time after the first
year. The reliability equation then becomes

    $R = \exp\{-L\,[1 + 0.4\,(t - 1)]\}$   for t > 1 year


As seen in Figure 5-4, this approximation yields a fairly good fit to the
primary prediction for the parameters used (M = 0.25, t in years). Because
of the small differences at the single string level, good agreement can be
expected at major system and spacecraft levels.


A convenient implementation of the piecewise linear model is by means of a
conversion table as shown below. The table is entered with the actual
mission time for which the prediction is to be generated. The chargeable
mission time is then obtained and is used in generating the prediction. The
hazards (lambdas) or hazard-time products can be added, subtotaled, etc. in
the same manner as hazards and hazard-time products of the conventional
reliability prediction procedures. Likewise, the existing relations for
redundancy remain applicable (with chargeable time substituted for actual
time).
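The time conversion can also be computed directly instead of being read from a table. The sketch below assumes the rule stated above (full credit for the first year, 40% of the mission time thereafter); the function and parameter names are illustrative, and the hazard is assumed to be expressed per year.

```python
import math

def chargeable_time(t_years, break_year=1.0, fraction=0.4):
    """Full credit for the first year, 40% of the mission time thereafter."""
    if t_years <= break_year:
        return t_years
    return break_year + fraction * (t_years - break_year)

def piecewise_reliability(hazard_per_year, t_years):
    """Exponential reliability evaluated at the chargeable mission time."""
    return math.exp(-hazard_per_year * chargeable_time(t_years))

for t in (0.5, 1.0, 2.0, 4.0, 7.0):
    print(f"actual {t:3.1f} yr -> chargeable {chargeable_time(t):3.1f} yr")
```

For example, an actual mission time of 4.0 years converts to 2.2 chargeable years and 7.0 years converts to 3.4, in line with the entries of Table 5-2.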
FIGURE 5-5  SIMPLE EXPONENTIAL APPROXIMATION AND DEMONSTRATED RELIABILITY


TABLE 5-2  TIME CONVERSION FOR PIECEWISE EXPONENTIAL PREDICTION

Actual        Chargeable        Actual        Chargeable
Time (Yrs)    Time (Yrs)        Time (Yrs)    Time (Yrs)

0.5           0.5               4.0           2.2
1.0           1.0               4.5           2.4
1.5           1.2               5.0           2.6
2.0           1.4               5.5           2.8
2.5           1.6               6.0           3.0
3.0           1.8               6.5           3.2
3.5           2.0               7.0           3.4

Simple  Exponential  Approximation 

An even simpler procedure, suitable for missions of up to 5 years, is to
reduce λp as listed in MIL-HDBK-217D to one-half of the stated value. This
permits use of all existing procedures without even the time conversion of
Table 5-2. The comparison of using one-half of the given λp with both the
original exponential prediction and with the demonstrated reliability for
Program A is shown in Figure 5-5. The 1/2 λp correction is applied at the
single string level and propagated to the spacecraft level by using the
assumption of five redundant segments discussed in Section 5.2.3. This
approximation provides very good agreement with the observed reliability
until the fifth year. As in any pure exponential assumption, the incremental
failure probability for long mission durations is substantially
overestimated, and this presents a problem in the use of this method for
mission planning.


5.3.2 Single Term Weibull Prediction

There are times when it is necessary to generate or to modify reliability
predictions for spacecraft segments which incorporate redundant components
but where the exact structure of the redundancy provisions is not known. A
single term Weibull reliability model of the form

    $R = \exp(-t^{b}/a)$

can be used in this connection. The beta parameter is set to 0.75, which
has been empirically determined to give a workable fit to the demonstrated
reliability of redundant spacecraft functions or entire spacecraft. The a
parameter is selected to fit a known reliability at a specified time, or to
agree with a prediction arrived at for short orbit times by the exponential
assumptions.

An example of the former approach is shown in Figure 5-6. The solid curves
represent the reliability of a redundant spacecraft function by the
exponential model and by the methodology described in Section 5.2. The broken
line is the single term Weibull approximation with T = 0.03 which was
selected to agree with the proposed prediction at ? years.

As an example of the use of this procedure consider a guidance system for
which the reliability on an existing satellite has been demonstrated to be
0.85 for two years on orbit. What will be the reliability of this same system
for seven years on orbit on a similar type of satellite?

From the basic equation for the single term Weibull model,

    $0.85 = \exp(-2^{0.75}/a) = \exp(-1.68/a)$

    $\ln 0.85 = -0.163 = -1.68/a$

    $a = 10.3$

Now, making use of the basic equation for the seven year prediction:

    $R = \exp(-7^{0.75}/10.3) = \exp(-0.42) = 0.66$

Using the exponential assumption for a redundant configuration will yield a
seven year reliability of 0.33.
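The worked example can be reproduced numerically. In the sketch below the function names are illustrative, and the "exponential assumption for a redundant configuration" is interpreted, as an assumption, as two identical exponential strings in parallel fitted to the same two-year point. Note that the exponent 7^0.75/a evaluates to about 0.42, so the seven year Weibull reliability exp(-0.42) is about 0.66.

```python
import math

BETA = 0.75  # shape parameter recommended in the text for redundant functions

def fit_scale(r_known, t_known, b=BETA):
    """Scale parameter a such that exp(-t**b / a) equals r_known at t_known."""
    return -(t_known ** b) / math.log(r_known)

def weibull_reliability(t, a, b=BETA):
    """Single term Weibull model R = exp(-t**b / a)."""
    return math.exp(-(t ** b) / a)

def dual_redundant_exponential(r_known, t_known, t):
    """Two identical exponential strings in parallel, fitted to the known point."""
    # Solve 1 - (1 - exp(-L * t_known))**2 = r_known for the string hazard L.
    L = -math.log(1.0 - math.sqrt(1.0 - r_known)) / t_known
    return 1.0 - (1.0 - math.exp(-L * t)) ** 2

a = fit_scale(0.85, 2.0)
print(f"a = {a:.1f}")
print(f"Weibull R(7 yr) = {weibull_reliability(7.0, a):.2f}")
print(f"redundant exponential R(7 yr) = {dual_redundant_exponential(0.85, 2.0, 7.0):.2f}")
```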
Appendix A


METHODOLOGY


This appendix discusses the calculation of reliability parameters
from failure report data. The first section discusses the
formulation of the failure ratio, the key parameter that is used
for the description of spacecraft reliability experience.
Section A.2 describes the generation of point and confidence
interval estimates for the Weibull model.


A.1 Failure Ratio


The key reliability parameter utilized in the body of this report is the
failure ratio, defined as the number of failures reported during a period
divided by the number of operational satellites for which failure reports
were obtained at the beginning of this period. All periods are referenced to
the launch date. The standard period is six months, but this was modified in
some instances as discussed below. Thus, period 2 extends from 7 months
after launch to 12 months after launch.


The failure ratio is an approximation of the hazard, which is defined as the
failure density function divided by the survivor function

    $h(t) = f(t)/[1 - F(t)]$
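The definition of the failure ratio translates into a one-line computation per period. The sketch below is only an illustration: the function name is hypothetical and the counts are made-up numbers for three consecutive six-month periods, not data from this report.

```python
def failure_ratio(failures_in_period, operational_at_start):
    """Failures reported in a period divided by satellites operational at its start."""
    if operational_at_start <= 0:
        raise ValueError("no operational satellites at start of period")
    return failures_in_period / operational_at_start

# Hypothetical counts for periods 1 to 3 (illustrative only, not from the report).
failures = [120, 60, 45]
operational = [100, 90, 85]

ratios = [failure_ratio(f, n) for f, n in zip(failures, operational)]
print([round(r, 2) for r in ratios])
```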


The number of surviving satellites was computed as the number launched minus
the number that had become non-operational. Since all failure reports
evaluated as part of this effort included an identification (sometimes coded)
of the satellite from which they were obtained, there was no problem in
determining the total number of satellites launched. Satellites lost as a
result of a launch vehicle malfunction were censored from this study. The
end of operational life of a satellite was recorded for approximately
two-thirds of the population in one or more of the chronologies listed under
Data Sources - Mission and Satellite Data in Appendix B. For the other
one-third of the population it was quite difficult to determine when a
satellite was no longer operational. The following criteria were adopted for
declaring a satellite non-operational when specific reports were not available:

- If the last reported failure was mission critical (criticality 1), the
satellite was declared non-operational as of the date of that failure

- If the last reported failure was not mission critical, the satellite was
assumed to have survived for the average time to next failure of
satellites which incurred a non-critical failure during the period of
its last reported failure.

This procedure gives creditable results except at the longest orbital life
times, for which special procedures were adopted as outlined below.

As the on-orbit time increased, the number of operational missions
decreased. For example, there were 297 missions initially; after 9 years,
there were only 17. Thus, the sample size decreased by a factor of 15, with
resultant wide variations in failure ratios which resulted strictly from the
normalization procedure and had no physical reality. Figure A-1 shows an
example of such fluctuations. These spurious results were undesirable
because they (1) increased uncertainty in parameter estimation and (2) made
visual assessments of the data more difficult. Therefore, after 5
operational years, sampling intervals were increased first to 1 year (i.e., 2
periods) and then, after 7 operational years, to 2 years. Figure A-2 shows
the results of such smoothing.

A second undesired effect of normalization was an apparent depression of the
initial failure ratio and an increase of failure ratios in late periods. The
apparent initial depression was due to the large number of infant mortalities
which made the failure ratio at the beginning a poor estimator of the average
number of operational spacecraft. In the late operational periods, the
fluctuations due to smaller sample sizes noted previously may cause an
apparent increase in the failure ratios. Increasing the interval sizes is
not possible because there are too few surviving flights. In order to
resolve both problems, failures occurring before the first month and after
the 102nd month were censored.
A.2 WEIBULL MODEL

The Weibull distribution parameters are estimated without any assumptions on
the time distribution of failures. One parameter, known as the shape factor,
determines whether the failure rate decreases, remains constant, or increases
over time. The second determines the frequency of failures. The Weibull
distribution can be expressed as (see footnote 1)

    $F(t) = 1 - \exp[-t^{b}/a]$     (A-1)


where F(t) is the cumulative failure probability, alpha is the scale
parameter, and beta is the shape parameter. The hazard function, h(t), of the
Weibull distribution is [LLOY77]

    $h(t) = (b/a)\,t^{b-1}$     (A-2)

The logarithm of equation A-2 results in the following expression:

    $\ln h(t) = \ln b - \ln a + (b - 1) \ln t$     (A-3)


In other words, the logarithm of the hazard function is linear with the
logarithm of operational time. The hazard function is simply the number of
failures per unit time, i.e., the normalized failure ratio defined in section
A.1. Thus, a regression of ln h(t) against ln t will yield a slope and
intercept which can be used to calculate the Weibull shape and scale factors
as follows:

    $b = \text{slope} + 1$     (A-4)

    $\ln a = \ln b - \text{intercept}$     (A-5)


1. The parameters of the Weibull distribution are listed as a and b in the
equations but are sometimes referred to as alpha and beta in the text.

Standard errors used for confidence intervals can be calculated by using the
propagation of errors method. If equation A-4 is differentiated, the result
becomes


d(b) = d(\mathrm{slope})

Squaring both sides and substituting the variances for the squares of these
differentials, as described in [BEVI72], gives


\mathrm{var}(b) = \mathrm{var}(\mathrm{slope})                          (A-6)


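Taking the square root of each side of equation A-6 shows that the standard
error of beta is the same as the standard error of the slope. A similar
procedure can be used to determine the standard error of alpha, the scale
parameter, based on the slope and intercept. Differentiating and squaring
equation A-5 results in


\frac{(da)^2}{a^2} = \frac{(db)^2}{b^2} - \frac{2}{b}\, db\, d(\mathrm{intercept}) + [d(\mathrm{intercept})]^2

Substituting variances and the covariance for the squared and cross-product
differentials, as was done for equation A-6, gives

\mathrm{std\ err}(a) = a \left[ \frac{\mathrm{var}(b)}{b^2} - \frac{2}{b}\, \mathrm{cov}(b, \mathrm{intercept}) + \mathrm{var}(\mathrm{intercept}) \right]^{1/2}      (A-7)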
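
The propagation-of-errors calculation can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the variance and covariance formulas are the textbook ones for a simple least-squares line (the report does not state how its regression variances were obtained), and the three data points are made up. The relative variance of a is obtained from the differential of equation A-5, d(a)/a = d(b)/b - d(intercept), with cov(b, intercept) equal to cov(slope, intercept) because b = slope + 1.

```python
import math

def ols_with_covariance(x, y):
    """Simple least-squares line with textbook variance estimates (assumed)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                              # residual variance
    var_slope = s2 / sxx
    var_intercept = s2 * (1.0 / n + x_mean ** 2 / sxx)
    cov_slope_intercept = -s2 * x_mean / sxx
    return slope, intercept, var_slope, var_intercept, cov_slope_intercept

def weibull_standard_errors(slope, intercept, var_slope, var_intercept, cov):
    """Standard errors of the Weibull a and b by propagation of errors."""
    b = slope + 1.0                                 # equation A-4
    a = math.exp(math.log(b) - intercept)           # equation A-5
    se_b = math.sqrt(var_slope)                     # from equation A-6
    # Variance of (slope / b - intercept), i.e. of d(a)/a
    rel_var_a = var_slope / b ** 2 - 2.0 * cov / b + var_intercept
    se_a = a * math.sqrt(rel_var_a)
    return a, b, se_a, se_b

# Made-up (ln t, ln h) pairs, for illustration only
ln_t = [0.0, 1.0, 2.0]
ln_h = [-3.0, -4.0, -4.0]
fit = ols_with_covariance(ln_t, ln_h)
a, b, se_a, se_b = weibull_standard_errors(*fit)
```

Because the slope and intercept of a fitted line are generally correlated, the covariance term should not be dropped; omitting it would misstate the standard error of a.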


BEVI72   P. R. Bevington, "Data Reduction and Error Analysis",
         McGraw-Hill, New York, 1972

LLOY77   D. K. Lloyd and M. Lipow, "Reliability: Management,
         Methods, and Mathematics", Second Edition, by the
         authors, 1977


Appendix  B 


DATA  SOURCES  AND  UTILIZATION 


This appendix describes the origins, nature, and format of the
data used in this report. Section B.1 discusses data sources and
section B.2 describes the organization of the SoHaR space data
base.


This  section  discusses  sources  of  both  failure  data  and  mission 
descriptions. 


Failure Reports


Failure reports were obtained from two sources: ODAP and OOSR.

ODAP -- Orbital Data Acquisition Program, generated by The Aerospace
Corporation. A sample of an ODAP report is shown in Figure B-1.

OOSR -- On-Orbit Spacecraft Reliability, PRC Report R-1863, 30 September
1978, and Analysis of Spacecraft On-Orbit Anomalies and Lifetimes, PRC Report
R-3579, 10 February 1983. A sample of an OOSR report is shown in Figure B-2.
The PRC reports also contain a number of statistical summaries of these
data.

FIGURE B-2. EXAMPLE OF OOSR REPORT

The following publications were used for descriptive data on programs and
individual flights:

L.J. Abella and M.B. Hollinger, "U.S. Navy in Space: Past, Present, and
Future", IEEE Trans. on Aerospace and Electronic Systems, Vol AES-20, No. 4,
July, 1984, p. 325

Launch dates, orbital parameters, launch vehicles, and disposition of
Navy-related satellites including TRANSIT, FLTSATCOM, and Space Test missions.


F.W. Buehl and R.E. Hammerand, A Review of Communication Satellites and
Related Spacecraft for Factors Influencing Mission Success, Vol. II,
Aerospace Corp. Report No. TOR-0076(6792)-II, November, 1975

Volume  II  of  this  report  contains  detailed  descriptions  of  many  pre-1977 
space  programs.  Included  were  parts  counts,  satellite  weights  and  power, 
subsystem  descriptions,  and  program  histories. 


E.S. Epstein, et al., "NOAA Satellite Programs", IEEE Trans. on Aerospace
and Electronic Systems, Vol AES-20, No. 4, July, 1984, p. 325

Mission histories, payload descriptions, and current status of NOAA
satellites.


R.F. Gould and Y.O. Lum, eds., A Review of Satellite Systems Technology, IEEE
Press, 1976.

Descriptions and mission histories of major communication satellite programs
including INTELSAT, ANIK, and ATS.


Charles  Hall,  "The  Pioneer  10/11  Programs;  from  1969  to  1994",  IEEE  Trans. 
Reliability,  Vol  R-32,  No.  5,  December,  1983,  p.  414 




Mission  history  and  payload  description  of  Pioneer  10/11  and  Pioneer  Venus 
programs. 


National Aeronautics and Space Administration, NASA Pocket Statistics, 1979

A brief description of all NASA missions including launch dates, vehicles,
orbital parameters, and mission disposition.


TRW Corp., TRW Space Logs, 1972, 1978, and 1983 editions

TRW space logs provided both descriptive information on selected satellites
and an extensive listing of launch dates, launch vehicles, and orbital
parameters.


TRW Corp., A Compendium of TRW Spacecraft Reliability Data, Vol. I, 1974

This report contained detailed mission histories, failure data, and parts
descriptions for 5 TRW programs including VELA, INTELSAT III, OAO,
Interplanetary Pioneer, and Pioneer Jupiter.


B.2  ORGANIZATION  OF  THE  DATA  BASE 

The  spacecraft  failure  data  base  which  supported  the  analyses  of  this  report 
consisted  of  three  types  of  files: 

-  Failure Reports: Data on the time, severity, cause, and affected
   subsystem and components of individual failures.

-  Descriptive Data: Data on mission types, launch dates, duration, orbital
   parameters, space vehicle characteristics, launch vehicles, and
   information sources.

-  Glossaries: Descriptions of codes used for causes, subsystems, parts,
   orbits, and launch vehicles.


All failure reports were contained in a single file consisting of 2613
records. Table B-1 describes the fields in the file. Descriptive data were
contained in two files. The first contained data generic to all missions in
a given program. Table B-2 describes this program file, which had a total of
92 records. The second descriptive file contained data on individual
missions (spacecraft) as shown in Table B-3.


TABLE B-1. DESCRIPTION OF THE FAILURE REPORT FILE

FIELD NAME     DESCRIPTION

PROGRAM        Program identification number
FLIGHTNO       Flight number
SUBSYSTEM      Affected subsystem
CAUSE1         Primary (or most important) cause
CAUSE2         Secondary cause
CAUSE3         Tertiary cause
PART           Affected part or assembly
FAILTIME       Operational month in which failure occurred
CRITICAL       Criticality level (1-7)
INCIDENT       Incident number in ODAP or PRC data base for traceability


TABLE B-2. DESCRIPTION OF THE PROGRAM DESCRIPTION FILE


FIELD NAME     DESCRIPTION

PROGRAM        Program identification number
(illegible)    Program name
AGENCY         Project managing agency
PROGSTART      Year in which program was initiated
FIRSTLAUNCH    Launch year of first mission
LASTLAUNCH     Launch year of final mission
PROGTYP        Program type (i.e., navigational, earth observation,
               scientific/experimental, or communication)
DESIGNR        Prime contractor
DESLIFE        Design life
PARTS          Total parts count
WEIGHT         In-orbit weight (excluding expendables) in kg
POWER          Beginning-of-life power in watts
GUIDANCE       Stabilization technique (i.e., 3-axis, gravity, or spin);
               if different techniques were used on some vehicles in
               the program, it was noted in the comments
(illegible)    Presence or absence of telemetry, tracking, and control
               computer
COMMENT        60-byte field for comments


TABLE B-3. DESCRIPTION OF THE INDIVIDUAL MISSION FILE


FIELD NAME     DESCRIPTION

PROGRAM        Program identification number
FLIGHTNO       Flight number
INTDESG        International designation number
YEAR           Year of launch
MO             Month of launch
LIFE           Mission life time (if known) in months
LASTRPT        Time of last failure report in months
CRITFAIL       Time of last critical report
TERMTIME       Time of mission failure used for analysis (set equal to
               LIFE if known, otherwise the procedure described in
               appendix A is used)
COUNT          Number of failure reports in this mission
ENDCODE        Final mission disposition (i.e., operational, terminated
               in orbit, launch failure, landed, decayed)
ERRSOURC       Source of failure reports (i.e., ODAP, PRC or other)
ENDSOURC       Reference on termination time and disposition
LAUNVEH        Launch vehicle
PERIGEE        Perigee in km
APOGEE         Apogee in km
INCLINATION    Orbital inclination in degrees
COMMENT        60-byte field for comments
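
To make the relationship between the files concrete, the following Python sketch models one record of the failure report file (Table B-1) and derives the per-mission report count that corresponds to the COUNT field of Table B-3. It is an illustration under stated assumptions: the field names come from Table B-1, but the Python types, the class and function names, and all sample values are hypothetical, and the original data base was not necessarily organized this way.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class FailureReport:
    """One record of the failure report file (fields per Table B-1).

    The types are assumptions; the report does not specify them.
    """
    program: int               # PROGRAM: program identification number
    flightno: int              # FLIGHTNO: flight number
    subsystem: str             # SUBSYSTEM: affected subsystem code
    cause1: str                # CAUSE1: primary cause code
    cause2: Optional[str]      # CAUSE2: secondary cause code, if any
    cause3: Optional[str]      # CAUSE3: tertiary cause code, if any
    part: str                  # PART: affected part or assembly code
    failtime: int              # FAILTIME: operational month of failure
    critical: int              # CRITICAL: criticality level (1-7)
    incident: str              # INCIDENT: ODAP or PRC incident number

def count_reports_per_mission(reports):
    """Number of failure reports per (PROGRAM, FLIGHTNO) pair."""
    return Counter((r.program, r.flightno) for r in reports)

# Invented sample records, for illustration only
reports = [
    FailureReport(1, 1, "POWER", "DESIGN", None, None, "BATTERY", 3, 4, "X-001"),
    FailureReport(1, 1, "TTC", "QUALITY", None, None, "RECEIVER", 14, 5, "X-002"),
    FailureReport(1, 2, "POWER", "ENVIRON", "DESIGN", None, "ARRAY", 1, 2, "X-003"),
]
counts = count_reports_per_mission(reports)
print(counts[(1, 1)], counts[(1, 2)])
```

Keying on the PROGRAM and FLIGHTNO pair reflects the fact that both the failure report file and the individual mission file carry these two fields, which is what allows a failure to be tied to a specific spacecraft.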